INTRODUCTION 

Field of the Invention 

The field of this invention is proteins involved in nerve cell guidance. 

Background 

Bilaterally symmetric nervous systems, such as those found in insects and vertebrates, 
have special midline structures that establish a partition between the two mirror image halves. 
Axons that link the two sides of the nervous system project toward and across the midline, 
forming axon commissures. These commissural axons project toward the midline, at least in 
part, by responding to long-range chemoattractants emanating from the midline. One 
important class of midline chemoattractants is the netrins (Serafini et al., 1994; Kennedy et 
al., 1994), guidance signals whose structure, function, and midline expression are evolutionarily 
conserved from nematodes and fruit flies to vertebrates (Hedgecock et al., 1990; Wadsworth 
et al., 1996; Mitchell et al., 1996; Harris et al., 1996). The attractive actions of netrins appear 
to be mediated by growth cone receptors of the DCC subfamily of the immunoglobulin (Ig) 
superfamily (Keino-Masu et al., 1996; Chan et al., 1996; Kolodziej et al., 1996). 

The midline also provides important short-range guidance signals. This is best 
illustrated by considering the different classes of axon projections in the spinal cord of 
vertebrates or the nerve cord of insects. Although some growth cones extend away from the 
midline, most extend towards or along the midline during some segment of their trajectory. 
Certain classes of growth cones either extend towards the midline or longitudinally along it 
and yet never cross it. Most growth cones (~90% in the Drosophila CNS), however, do cross 
the midline. After crossing, the majority of these growth cones turn to project longitudinally, 
growing along or near the midline. Interestingly, these axons never cross the midline again, 
despite navigating in the vicinity of other axons that continue to cross. 

What midline signals and growth cone receptors control whether growth cones do or 
do not cross the midline? After crossing once, what mechanism prevents these growth cones 
from crossing again? Studies in the chick (Stoeckli and Landmesser, 1995; Stoeckli et al., 
1997) and grasshopper (Myers and Bastiani, 1993) embryos have led to the suggestion that the 
midline contains a contact-mediated repellent, and that commissural growth cones must 
overcome this repellent to cross the midline. For example, this notion that the midline can be 
repulsive even to growth cones that cross it is supported by time-lapse imaging of the first 
commissural growth cone in the grasshopper embryo. On contacting the midline, this growth 
cone often abruptly retracts, although ultimately it overcomes the repulsion and crosses the 
midline. 

One approach to find the genes encoding the components of such a midline guidance 
system is to screen for mutations in which either too many or too few axons cross the midline. 
Such a large-scale mutant screen was previously conducted in Drosophila and led to the 
identification of two key mutations: commissureless (comm) and roundabout (robo) (Seeger et 
al., 1993; reviewed by Tear et al., 1993). In comm mutant embryos, commissural growth 
cones initially orient toward the midline but then fail to cross it and instead recoil and extend 
on their own side. comm encodes a novel surface protein expressed on midline cells. As 
commissural growth cones contact and traverse the CNS midline, Comm protein is apparently 
transferred from midline cells to commissural axons (Tear et al., 1996). In robo mutant 
embryos, many growth cones that normally extend only on their own side instead now project 
across the midline, and axons that normally cross the midline only once instead appear to 
cross and recross multiple times (Seeger et al., 1993; Kidd et al., 1997). Double mutants of 
comm and robo display a robo-like phenotype. 

Here we disclose the characterization of robo across animal species. robo encodes a 
new class of guidance receptor with 5 Ig domains, 3 fibronectin (FN) type III domains, a 
transmembrane domain, and a long cytoplasmic domain. Robo defines a new subfamily of Ig 
superfamily proteins that is highly conserved from fruit flies to mammals. The results of 
protein expression and transgenic rescue experiments indicate that Robo functions as the 
gatekeeper controlling midline crossing and that Robo responds to an unknown midline 
repellent. 

SUMMARY OF THE INVENTION 
The invention provides methods and compositions relating to Robo1 and Robo2 
(collectively, Robo) polypeptides, related nucleic acids, polypeptide domains thereof having 
Robo-specific structure and activity, and modulators of Robo function. Robo polypeptides 
can regulate cell, especially nerve cell, function and morphology. The polypeptides may be 
produced recombinantly from transformed host cells from the subject Robo polypeptide 
encoding nucleic acids or purified from mammalian cells. The invention provides isolated 
Robo hybridization probes and primers capable of specifically hybridizing with natural Robo 
genes, Robo-specific binding agents such as specific antibodies, and methods of making and 
using the subject compositions in diagnosis (e.g. genetic hybridization screens for Robo 
transcripts), therapy (e.g. Robo inhibitors to promote nerve cell growth) and in the 
biopharmaceutical industry (e.g. as immunogens, reagents for isolating Robo genes and 
polypeptides, reagents for screening chemical libraries for lead pharmacological agents, etc.). 

BRIEF DESCRIPTION OF THE FIGURES 
Figure 1 Organization of the roundabout Genomic Locus 

(A) Cosmid chromosome walk through the 58F/59A region of the 2nd chromosome. The 
position of deficiency breakpoints within the cosmids used are shown in the top two rows. 
Identified transcripts from the walk are shown below the cosmids. The 12-1 transcript 
corresponds to the robo gene; the direction of transcription is distal to proximal. The location 
of the 16 kb XbaI genomic rescue fragment is indicated below. 

(B) Position and size of introns within the robo transcript. Coding sequence is indicated by 
the thicker part of the line. Introns are represented by gaps. The transcript is shown 3'-5' to 
reflect its orientation in (A). 

Figure 2 Structure of Robo Protein 

Schematic of the structure of Drosophila Robo protein. The position of the Immunoglobulin 
(Ig), fibronectin (FN) and transmembrane (TM) domains and the amino acid substitution in 
robo 6 are shown. Percent amino acid identity between Drosophila Robo 1 and Human Robo 1 
is indicated for each domain. 


DETAILED DESCRIPTION OF THE INVENTION 
The nucleotide sequences of exemplary natural cDNAs encoding Drosophila 1, 
Drosophila 2, C. elegans, human 1, human 2 and mouse 1 Robo polypeptides are shown as 
SEQ ID NOS:1, 3, 5, 7, 9 and 11, respectively, and the full conceptual translates are shown as 
SEQ ID NOS:2, 4, 6, 8, 10 and 12. The Robo polypeptides of the invention include 
incomplete translates of SEQ ID NOS:1, 3, 5, 7, 9 and 11 and deletion mutants of SEQ ID 
NOS:2, 4, 6, 8, 10 and 12, which translates and deletion mutants have Robo-specific amino 
acid sequence, binding specificity or function. Preferred translates/deletion mutants comprise 
at least a 6, preferably at least an 8, more preferably at least a 32, most preferably at least a 64 
residue domain of the translates. In a particular embodiment, the deletion mutants comprise 
one or more structural/functional Robo immunoglobulin, fibronectin or cytoplasmic motif 
domains described herein. For example, soluble forms of the disclosed Robo polypeptides 
which comprise one or more Robo IG domains, and especially fusions of two or more Robo 
IG domains, particularly fusions of IG#1 and #2, provide competitive inhibitors of Robo- 
mediated signaling. Exemplary such deletion mutants and recombined deletion mutant 
fusions include human Robo 1 (SEQ ID NO:8) residues 1-67; 68-167; 168-259; 260-350; 351- 
451; 1-167; 1-259; 1-350; 1-451; 68-259; 1-67 joined to 168-259; and 1-67 joined to 260-451. 
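The residue ranges listed above read as 1-based, inclusive coordinates, so each deletion mutant or fusion can be derived mechanically from a full-length sequence. The following is a minimal Python sketch of that bookkeeping and is not part of the disclosure: the function name `fragment` and the placeholder sequence are hypothetical, and a real use would substitute the SEQ ID NO:8 sequence.

```python
def fragment(sequence, ranges):
    """Join 1-based, inclusive residue ranges of `sequence` in the order given."""
    parts = []
    for start, end in ranges:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"range {start}-{end} is outside 1-{len(sequence)}")
        parts.append(sequence[start - 1:end])  # convert to 0-based slicing
    return "".join(parts)


# Placeholder standing in for a full-length polypeptide (NOT a real Robo sequence).
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 25  # 500 residues

single = fragment(placeholder, [(68, 167)])             # e.g. "68-167"
fusion = fragment(placeholder, [(1, 67), (168, 259)])   # e.g. "1-67 joined to 168-259"
print(len(single), len(fusion))
```

Keeping the ranges as explicit (start, end) pairs makes it easy to check each construct against the list of exemplary mutants before any cloning step.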

Other deletion mutants provide Robo-specific antigens and/or immunogens, especially 
when coupled to carrier proteins as described below. Generic Robo-specific peptides are 
readily apparent as conserved regions in the aligned Robo polypeptide sequences of Table 1. 

Table 1. Sequence Alignment of Robo Family Members: The complete amino acid alignment 
of the predicted Robo proteins encoded by Drosophila robo 1 (D1, SEQ ID NO:2) and Human 
robo 1 (H1, SEQ ID NO:8) is shown. The extracellular domain of C. elegans robo (CE, SEQ 
ID NO:6; Sax-3; Zallen et al., 1997), the extracellular domain of Drosophila robo 2 (D2, SEQ 
ID NO:4), and partial sequence of Human robo 2 (H2, SEQ ID NO:10) are also aligned. The 
D2 sequence was predicted by the gene-finder program Grail. The positions of 
immunoglobulin domains (Ig), fibronectin domains (FN), the transmembrane domain (TM), 
and conserved cytoplasmic motifs are indicated. The extracellular domain of rat robo 1 is 
nearly identical to H1. 

mH PMHpENHAIaRSTSTTNNPSrsRSSRMWLlpAWLLLVLVASNGLP 47 D1 

m.FNRKTLlCTi. 11V1QA vlrsFCEDASNlA 30 CE 

mKWKHVP FlVMiSllSl SpNHLFL aQL I PDPEDvE r G . NDHGTP IpTS DNDDNS LGYTGS 5 9 HI 

>IG #1 

AVrGQYQSpriiehpTdlwKknepatlnckVegKpEptiewfkdgepvStn. . EKKshr 105 Dl 

GENpriiehpMdTTvPknDpFtFncQaegNptptiQwfkdgRELKt . . . dTGshr D2 

pViiehpIdVwsRgSpatlncGaK. PStAKiTwykdgQpvItnkEQVNshr 81 CE 

RLrQEDFPpriVehpSdllvskgepatlnckaegRptptiewykGgeRvEtDkDdPRshr 119 HI 

>IG #2 

VQFKDgAlf f YriMQgkkeQ . . dGgEywcvaknRVgQavsrHaslqlavlrddf rvepKd 163 Dl 

iMlpAgGlf f lkvIhSrReS . . dagTywcEakneFgVaRs maTlqy av 1 rdE f rLepAW D2 

iVlDTgslf LlkvNSgkNGKDSdagAyYcvaSneHgeVKsNEGslKLaMlrEdf rvRpRT 141 CE 

MLlpSgslf f IriVhgrkSRP.dEgVyVcvaRnYLgeavsHnaslEvallrddfrQNpSd 17 8 HI 

trvaKgeTallecgppKglpeptLIwIkdgVplddLKAmSFGASSrVrivdggnlLiSNv 22 3 Dl 

trvaQgeValmecgAprgSpepQiswrkNgQTlNL VGNKririvdggnlAiQEA D2 

vQALGgeMavlecSpprgFpepWswrkdDKElRI . QDmP rYTLHSDgnlliDPv 195 CE 

vMvaVgePavmecQpprgHpeptiswKkdgSpldd KDEri . TIRggKlMiTYT 230 HI 

>IG #3 

EPIdEgNyKcIaQnLvgtresSYaKlIvQvkpYfMkepkdqVMLYgQTaTfHcSvggdpP 2 83 Dl 

rQsdDgRyqcvVKnVvgtresATaFlKvHvrpFLIRGpQngtAVvgSsvVfQcrlggdpL D2 

DRsdSgTyqcvaNnmvgerVsNPaRlSvFekpKfEQepkdMtvDvgAAvLfDcrvTgdpQ 2 55 CE 

r Ks dAgKyVc vGTnmvge r e s E VaE 1 TvLe rpS f VkRp S riL AvTvDD s aE f Kc E ARgdp V 2 90 HI 

pKvlwkk. .EEgnlpvsrA RiLHdEKslEiSNItpTdegTyvceaHnNvg 331 Dl 

pDvlwrrTASGgnmpLRKFSWLHSASGRVHVl . EdrslkLDDvtLEdmgeytceaDnAvg D2 

pQITwkr. . KNEPmpvTra YiAKdNrGlRiERvQpSdegeyvcYaRnPAg 3 03 CE 

pTvRwrk . . DDgELp Ks rY Ei . RddHTlkiRKvtAGdmgSytcVaEnMvg 3 3 7 HI 

>IG #4 

QiSaRaSlIvhappNfTKrpSnKKvGlNgVvQLPcMaSgnpPpSvfwTkegVSTlMfpn. 3 88 Dl 

GiTaTGIltvhappKf vlrpKnqLvEIgDEvLf ecQaNgHpRpTLYwsVegNSSllLpGy D2 

TLeasaHlRvqappSfQTkpAdqSvPAggtAtf ecTLVgQpSpaYf wskegQqDllf psy 3 63 CE 

KAeasaTltvqEppHfvVkpRdqVvalgrtvtf QceaTgnpqpalf wRRegsqnllf . sy 396 H1 

qlvaQgrtvtf PceTKgnpqpavf wQkegsqnllf pn. H2 


SsHGrQYvAADgtlQitDvrqedegyyv. cSaFSwDssTVrVFlQvSS . . vD 440 Dl 

RDGRMEVTLTPEGRSVlSiARFAredSgKVvTcNalnAvgsVSsrTWSvDt . . QF D2 

VSADGRTK. . vsptgtltiEEvrqVdegAyv . cAGMnSagsslskaAlKvttKAvTGNTP 420 CE 

qpPQsSsrFsvsQtgdltitnvqrsdVgyyi.cqTlnvagsilTkaYlevtd. .vIA. . . 450 HI 

qpQQPNsrCsvsptgdltitnlqrsdAgyyi.cqalTvagsilAkaQlevtd. . vLT . H2 


>IG #5 

erpppi iQIgpAnqtlpKgsVaTlpcratgNpSpRiKwFHdgHAvQA . GNRYS i . iqG . . 4 96 Dl 

eLpppiieqgpvnqtlpvKsIWlpcrTLgTpvpQVswYLdglpidVqEHERrNLsDA. . D2 

AKpppTieHgHQnqtlMvgsSallpcQaSgKpTpGiswlRdgLpidlTd . . sri . sqHST 477 CE 

drpppViRqgpvnqtVavdgtFvlScVatgSpvpTiLwRkdgVT,vSTqd. .sriK.qLeN 507 HI 
drpppiiLqgpAnqtlavdgtaLcKcKatgDpLpViswlkEgFTFPGRd. .PrATiq.eQ H2 

>FN #1 

SslRVDdlq. lsdSgtytciasGeRgeTswAaTltveKpgs . .TSLHraAdpstypAppg 553 D1 

gAlTiSdlqrHEdEgLytcvasnRNgKsswsGylRLDTptNpNiKfFrapElstypgppg D2 

gslHiAdl . kKPdtgVy t c i aKneDge s tws aS 1 1 veDHt sN . Aqf VrMpdp sNFp s SpT 535 CE 

gvlqiR . YAklGdtgRytciasTPsgeatwsaylEvQeFgVp . VqPPrPTdpNLlpsAps 565 H1 

gTlqiKNl . rlsdtgtytcvaTSSsgeaswsaVlDvTeSgAT . i . . SKNYdlsDLpgpps H2 


TpKvLnvsrtsISlRwAKSqEKPGAVgpli . gyTVeyf spdlQTgwIVAaHrvGDtQVti 612 Dl 

kpqMvEKGEnsvtl sw . . . TRSNKVggSSLVgyVieMf GKNETDgwVAvGTrvQNttFtQ D2 

QpIIvnvtDtEvElHw. . . NAPSTsgaGpitgyiiQyYspdlgQTwFNIPDYvAStEyRi 592 CE 

kpEvtdvsrnTvtlsw. . . qpNLNsgaTp . tSyiieaf sHASgSswqtvaENvktEtSAi 621 HI 

kpqvtdvtKnsvtlsw. . . qpGTPGTLpA . SAyiieaf sQSVSNswqtvaNHvkttLytV H2 

>FN #2 

SglTpgtsyVflvraenTQgisvpsGLsNViktlEA Df DAASANdlsAarT . HTg 667 Dl 

TglLpgVWyFf liraenSHgLsLpsPMsEpitVGTR Yf NS . . gLdlsEarASllsg D2 

kglkpSHsyMf ViraenEkgiGTpsVSsAXiVttSKPAAQVAlSDKNKMdMAIaEKRlTsE 652 CE 

kglkpnAiylflvraAiiAYgisDpsqlsDpvktQDV lPTSQgVdHKQVQRE . 1GN 675 HI 

RglRpntiylf MvralnPkV . svT . q H2 

KSvellDasAinAsavrlEwMLHvSADEkyvegLRiHyK. . DaSVPSAQYHS ITvMDAsa 725 D1 

DwelSnasvVDstsMKlTwQI . . . iNGkyvegFyVYArQLpNPLNTKyRMLTILNGGGa D2 

QLIKlEEVKTinstavrlFwKKR. . KLEELiDgyyiKWrGPpRTNDNQyW . . .vTSpsT 707 CE 

AvLHlHnPTvLSsssIEVHwT. . vDQQS Qy i QgyKi Lyr PS GaNHGE SD WLVFE vRTpAK 733 H1 


>FN #3 

esFwGnlKkytKyef fLTpf . . . f ETiegQpsnskTaltYedvpsappDNIQiGmYn . . 780 Dl 

SsCTiTGlVQytLyef f IVpf . . . YKsVegKpsnsRIaRtledvpsEApYgMEALLln . . D2 

eNYwSnlMPFtnyef f VIpYHSGVHsiHgapsnsMDVltAeAPpsLppEDvRiRmlnL . 766 CE 

NsVviPDlRkGVnyelKARpf . . . f NE FQgaD s E I kFaKt 1 eEAp s appQgvTVS KNDGN 790 HI 

Q t aGWvRwTppp S QHHngNl Ygy k i E VS AgnTM KVlAnMtLnaTtTsvLlNnltt 83 5 Dl 

S S aVFLKwkapELKDRHgVl LNyH . vi vRg ID t AHNFSRI 1 TnVt IdaASPTLvl Anl t E D2 

. tTLRIswkapKAdGIngllKgFQiviv . gQAPNNNR nl t TnERAAs vTl FH1 Vt 819 CE 

GtalLvswQpppEdTQngMVQEykV. WCLgnEtR YHInKtVdGStFswIPFlVP 844 HI 


gAVysvrLNSFtKagDgpysKpISlFMdpTHHVHPpRAHPsGTHDGRHEGqDLTYHIsnsrgN 8 95 Dl 

g VMyTvGvaaGNnagvgpy C Vp AT lRl dp I TKRLDp F INQRDHVND D2 

gMTyKIrvAARSnGgvgv ShgTSEVIMNqDTlEKHL . AAQqENESFLYgL 8 6 8 CE 

gIRysvEvaaStGagSgvKsEpQFIQIdAhgNPVSpEDqVslAQQI 890 HI 

> TM < 

iPPGDINPTTHKKTTdYlSGpwLMViVCiVlLvlVisAAIsM.vyFkrkhQmTKElGHLS 954 Dl 

vlTqpwFIiiLgAilavlMLs. . f GAMvFVkrkhMm . . MkQsAL D2 

iNK SHVpVIViVaILiIFvViiIAY.CYwRUS.rNSD. . . gkDRS F 909 CE 

SdvVKqp. .AFiagiGAaCWiiLMVf sIwLyRHrkKR. . NglTsTY 932 HI 

WSDNEIT AlniNSKESL . wIDHHRGwRTADTDKD . . 988 Dl 

AGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLliliSEPAAQPwLAD . . TwPNTGNNHNDC 990 HI 

SgLsEsKlLSHVNSSQ. . SnynnS DGGtDyAEvd .... TRNL 1024 Dl 

SISCCTAGNgNsDsNlTTYSRPADCIAnynnQLDNKQTNLMLPEStVyGDvdLSNKINEM 10 5 0 HI 

CYTOPLASMIC MOTIF #1 

TtfYNCR KSPDNptpyattMIiGTS sSETCTkT . TSISADkDSGT 1068 Dl 

KtfNSPMLKDGRFVNPSGQptpyattQLiQSNLSNNMNNGsGDSGEkHWKPLGQQkQEVA 1110 HI 

HS Py S DAFAGQVPAVpW . . KSNyLqYPVEP 1097 Dl 

PVQyNIVEQNKLNKDYRANDTVPpTIPYNQSyDqNTGGSYNSSDRGSSTSGSQGHKKGAR 1170 H1 

CYTOPLASMIC MOTIF #2 

InwSEFlppppEhppp. . .sSTy GyAqGSp 1124 Dl 

TPKVPKQGGMnwADLlppppAhpppHSNsEEyNISVDESyDqEMpCPVPPARMYLQQDEL 12 3 0 HI 

. .eSSRKSSKSAGSglSTNQSILNAsIHsSSSGGFsAWGVSPQYAVAcp 1171 Dl 

EEeEDERGPTPPWgAASSPAAVSYsHQsTATLTPsPQEELQPMLQDcpEETGHMQHQPD 12 90 HI 

pENVy..-SNpl SAVAGGTQNRYQITPTNQHPPQl. . . . 1203 Dl 

RRRQPVSPPPPPRPISpPHTyGYIsGplVSDMDTDAPEEEEDEADMEVAKMQTRRlLLRG 13 5 0 HI 

_ ..pay FATTGPGGAVPPNHLP faTQRHaa 12 3 0 Dl 

LEQTpaSSVGDLESSVTGSMINGWGSASEEDNISSGRSSVSSSDGSFFTDADf aQAVAaa 1410 HI 

SeyQaglNAar cAQSRACMsCdALATPSPmq 12 61 Dl 

Aey.aglKVarRQMQDAAGRRHFHASQcPRPTSPVsTdSNMSAAVmqKTRPAKKLKHQPG 14 69 HI 

CYTOPLASMIC MOTIF #3 

ppppvpVpEGWYQPVHPNSH.PMHpTS.SNHQIYQCSSECsDHSRSsQS 13 07 Dl 

HLRRETYTDDLppppvpPpAIKSPTAQSKTQLEVRpVWPKLPSMDARTDRsSDRKGsSY 152 9 HI 

HKrQL QLEeHGSSAkQrgGHHRRrA . pWQPCMESeN ENM Dl 

KGrEVLDGRQWDMRTNPGDPREAQeQQNDGkGrgNKAAKrDLpPAKTHLIQeDILPYCRPTF HI 

LAEYEQrQYTsDCCNssrEGDTC SCSeGSCl . . yAeAgePAPRQMTAKNT 1395 Dl 

PTSNNPrDPSsSSSMssrGSGSRQREQANVGRRNIAeMQVlGGy . eRgeDNNEELEETES 1651 HI 

Exemplary such Robo specific immunogenic and/or antigenic peptides are shown in Table 2. 

Table 2. Immunogenic Robo polypeptides eliciting Robo-specific rabbit polyclonal antibody: 
Robo polypeptide-KLH conjugates immunized per protocol described below. 

Robo Polypeptide, Sequence          Immunogenicity 
SEQ ID NO:2, residues 68-77         +++ 
SEQ ID NO:2, residues 79-94         +++ 
SEQ ID NO:2, residues 95-103        ++ 
SEQ ID NO:2, residues 122-129       +++ 
SEQ ID NO:2, residues 165-176       +++ 
SEQ ID NO:2, residues 181-191       +++ 
SEQ ID NO:2, residues 193-204       +++ 
SEQ ID NO:2, residues 244-251       +++ 
SEQ ID NO:2, residues 274-290       +++ 
SEQ ID NO:2, residues 322-331       +++ 
SEQ ID NO:2, residues 339-347       +++ 
SEQ ID NO:2, residues 407-417       +++ 
SEQ ID NO:2, residues 441-451       +++ 
SEQ ID NO:2, residues 453-474       +++ 
SEQ ID NO:2, residues 502-516       +++ 
SEQ ID NO:2, residues 541-553       +++ 
SEQ ID NO:2, residues 617-629       +++ 


In addition, species-specific antigenic and/or immunogenic peptides are readily apparent as 
diverged extracellular or cytosolic regions in Table 1. Exemplary such human specific 
peptides are shown in Table 3. 

Table 3. Immunogenic Robo polypeptides eliciting human Robo-specific rabbit polyclonal 
antibody: Robo polypeptide-KLH conjugates immunized per protocol described below (some 
antibodies show cross-reactivity with corresponding mouse/rat Robo polypeptides). 

Robo Polypeptide, Sequence          Immunogenicity 
SEQ ID NO:8, residues 1-12          +++ 
SEQ ID NO:8, residues 18-28         +++ 
SEQ ID NO:8, residues 31-40         +++ 
SEQ ID NO:8, residues 45-65         +++ 
SEQ ID NO:8, residues 106-116       +++ 
SEQ ID NO:8, residues 137-145       +++ 
SEQ ID NO:8, residues 174-184       +++ 
SEQ ID NO:8, residues 214-230       +++ 
SEQ ID NO:8, residues 274-286       +++ 
SEQ ID NO:8, residues 314-324       +++ 
SEQ ID NO:8, residues 399-412       +++ 
SEQ ID NO:8, residues 496-507       +++ 
SEQ ID NO:8, residues 548-565       +++ 
SEQ ID NO:8, residues 599-611       +++ 
SEQ ID NO:8, residues 660-671       +++ 
SEQ ID NO:8, residues 717-730       +++ 
SEQ ID NO:8, residues 780-791       +++ 
SEQ ID NO:8, residues 835-847       +++ 
SEQ ID NO:8, residues 877-891       +++ 
SEQ ID NO:8, residues 930-942       +++ 
SEQ ID NO:8, residues 981-998       +++ 
SEQ ID NO:8, residues 1040-1051     +++ 
SEQ ID NO:8, residues 1080-1090     +++ 
SEQ ID NO:8, residues 1154-1168     +++ 
SEQ ID NO:8, residues 1215-1231     +++ 
SEQ ID NO:8, residues 1278-1302     +++ 
SEQ ID NO:8, residues 1378-1400 
SEQ ID NO:8, residues 1460-1469 
SEQ ID NO:8, residues 1497-1519 
SEQ ID NO:8, residues 1606-1626     +++ 
SEQ ID NO:8, residues 1639-1651     +++ 
SEQ ID NO:10, residues 5-16         +++ 
SEQ ID NO:10, residues 38-47        +++ 
SEQ ID NO:10, residues 83-94        +++ 
SEQ ID NO:10, residues 112-125      +++ 
SEQ ID NO:10, residues 168-180      +++ 
SEQ ID NO:10, residues 195-209      +++ 
SEQ ID NO:10, residues 222-235      +++ 
SEQ ID NO:10, residues 241-254      +++ 


In a particular embodiment, expressed sequence tags EST:yu23d11, Accession 
#H77734 and EST:yq76e12, Accession #H52936, as well as peptides conceptually encoded 
thereby, are not within the scope of the present invention (Tables 4 and 5). In a particular 
embodiment, the subject Robo polypeptides exclude the corresponding regions of the 
disclosed natural human Robo 1 polypeptide, i.e. SEQ ID NO:8, residues 168-217 and SEQ ID 
NO:8, residues 1316-1485. 


Table 4. EST:yu23d11 sequences compared to H-Robo1. yu23d11 refers to the fragment of 
DNA which was sequenced. The fragment was sequenced from both ends, generating the 
following two sequences: H77734 and H77733. yu23d11 is an unspliced cDNA. Only bases 
59-215 match the coding sequence of H-Robo1 (502-651). The remaining bases are intronic. 
No bases of H77733 match the coding sequence of H-Robo1. 

LRDDFRQNPSDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDER H-Robo1 
LRDDFRQKPSDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDER EST H77734 

There is an error in the sequence, a T to G change which results in the amino acid N being 
replaced by K. The sequence is shown below and has been reversed for clarity: 

TACTTCGGGATGACTTCAGACAAAACCCTTCGGATGTCATGGTTGCAGTA H-Robo1 
TACTTCGGGATGACTTCAGACAAAAACCTTCGGATGTCATGGTTGCAGTA EST H77734 
LRDDFRQKPSDVMVAV 
       N 
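The single-residue difference described for Table 4 can be checked by translating the two 50-base fragments shown above. The following is a minimal Python sketch under stated assumptions: the standard genetic code is used, the reading frame is taken to start at the third base (which reproduces the peptide LRDDFRQ...), and the names `translate`, `frag_aaa` and `frag_aac` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate `dna` from offset `frame`, ignoring any trailing partial codon."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# The two fragments from Table 4; they differ at one base (AAA versus AAC codon).
frag_aaa = "TACTTCGGGATGACTTCAGACAAAAACCTTCGGATGTCATGGTTGCAGTA"
frag_aac = "TACTTCGGGATGACTTCAGACAAAACCCTTCGGATGTCATGGTTGCAGTA"

print(translate(frag_aaa, frame=2))  # codon AAA encodes K
print(translate(frag_aac, frame=2))  # codon AAC encodes N
```

The fragment carrying the AAA codon yields the K-containing peptide and the fragment carrying AAC yields the N-containing peptide, which is the substitution the text attributes to the sequencing error.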

Table 5. EST:yq76e12 sequences compared to H-Robo1. yq76e12 refers to the fragment of 
DNA which was sequenced. The fragment was sequenced from both ends, generating the 
following two sequences: H52936 and H52937 (the latter has been reversed for clarity). The 
sequences can be seen to overlap in the middle. A gap indicates a frameshift error. Note that 
errors only occur in one sequence at any one position. 

GPLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSV H-Robo1 
GPLVSDMDTDAPEEEEDEADMEVAKMQT . RLLLRGLEQTPASSV EST H52936 

GDLESSVTGSMINGWGSASEEDNISSGRSSVSSSDGSFFTDADF 
GDLESSVTGSMINGWGSASEEDNISSGRSSVSSSDGSFFTDADF 

AQAVAAA AEYAGLKVARRQMQDA AGR RHFH AS QC PRPT H-Robo1 
AQAVAAA AEYAGLKVARRQMQDA AGR RHFH AF QC PRPT EST H52936 
?AAT A ? YAGL KYARRQMRDA AGR RHFH AS QC PRPT EST H52937 

S P VS TD SNMS AAVMQKTRPAKKLKHQ PGHLRRET YTDDLP P P P V H-Robo1 
SPVFTDSNM EST H52936 
SPVSTDSNMSAAVMQKTRPAKKLKHQPGHLRRETYTDDLPPPPV EST H52937 

PPPAIKSPTAQSKTQLEVRPWVPKLPSMDARTDK H-Robo1 
PPPAIKSPTAQSKTQLEVRPVWPKLPSMDARTDK EST H52937 

The subject domains provide Robo domain specific activity or function, such as 
Robo-specific cell, especially neuron modulating or modulating inhibitory activity, Robo- 
ligand-binding or binding inhibitory activity. Robo-specific activity or function may be 
determined by convenient in vitro, cell-based, or in vivo assays: e.g. in vitro binding assays, 
cell culture assays, in animals (e.g. gene therapy, transgenics, etc.), etc. Binding assays 
encompass any assay where the molecular interaction of a Robo polypeptide with a binding 
target is evaluated. The binding target may be a natural intracellular binding target, a Robo 
regulating protein or other regulator that directly modulates Robo activity or its localization; 
or non-natural binding target such as a specific immune protein such as an antibody, or a Robo 
specific agent such as those identified in screening assays such as described below. Robo- 
binding specificity may be assayed by binding equilibrium constants (usually at least about 
10^7 M^-1, preferably at least about 10^8 M^-1, more preferably at least about 10^9 M^-1), by the ability 
of the subject polypeptide to function as negative mutants in Robo-expressing cells, to elicit 
Robo specific antibody in a heterologous host (e.g. a rodent or rabbit), etc. 
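Binding thresholds stated as association constants, such as 10^8 M^-1 or 10^9 M^-1, are often easier to compare with assay data when expressed as dissociation constants. The following minimal Python sketch shows the conversion K_d = 1/K_a; the function name `kd_nanomolar` is hypothetical and the snippet is an arithmetic illustration only.

```python
def kd_nanomolar(ka_per_molar):
    """Return the dissociation constant in nM for an association constant in M^-1."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1e9 / ka_per_molar  # K_d (M) = 1 / K_a, then 1 M = 1e9 nM


for ka in (1e8, 1e9):
    print(f"K_a = {ka:.0e} M^-1  ->  K_d = {kd_nanomolar(ka):g} nM")
```

A higher association constant corresponds to a lower dissociation constant, so the "more preferably" threshold is the tighter binder.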

The claimed Robo polypeptides are isolated or pure: an "isolated" polypeptide is 
unaccompanied by at least some of the material with which it is associated in its natural state, 
preferably constituting at least about 0.5%, and more preferably at least about 5% by weight 
of the total polypeptide in a given sample and a pure polypeptide constitutes at least about 
90%, and preferably at least about 99% by weight of the total polypeptide in a given sample. 
A polypeptide, as used herein, is a polymer of amino acids, generally at least 6 residues, 
preferably at least about 10 residues, more preferably at least about 25 residues, most 
preferably at least about 50 residues in length. The Robo polypeptides and polypeptide 
domains may be synthesized, produced by recombinant technology, or purified from 
mammalian, preferably human cells. A wide variety of molecular and biochemical methods 
are available for biochemical synthesis, molecular expression and purification of the subject 
compositions, see e.g. Molecular Cloning, A Laboratory Manual (Sambrook et al., Cold 
Spring Harbor Laboratory), Current Protocols in Molecular Biology (Eds. Ausubel et al., 
Greene Publ. Assoc., Wiley-Interscience, NY) or that are otherwise known in the art. 

The invention provides binding agents specific to the claimed Robo polypeptides, 
including natural intracellular binding targets, etc., methods of identifying and making such 
agents, and their use in diagnosis, therapy and pharmaceutical development. For example, 
specific binding agents are useful in a variety of diagnostic and therapeutic applications, 
especially where pathology, wound repair incompetency or prognosis is associated with 
improper or undesirable axon outgrowth, orientation or inhibition thereof. Novel Robo- 
specific binding agents include Robo-specific receptors, such as somatically recombined 
polypeptide receptors like specific antibodies or T-cell antigen receptors (see, e.g., Harlow and 
Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory), natural 
intracellular binding agents identified with assays such as one-, two- and three-hybrid screens, 
non-natural intracellular binding agents identified in screens of chemical libraries such as 
described below, etc. Agents of particular interest modulate Robo function. 

In a particular embodiment, the subject polypeptides are used to generate Robo- or 
human Robo-specific antibodies. For example, the Robo- and human Robo-specific peptides 
described above are covalently coupled to keyhole limpet hemocyanin (KLH) and the conjugate is 
emulsified in Freund's complete adjuvant. Laboratory rabbits are immunized according to 
conventional protocol and bled. The presence of Robo-specific antibodies is assayed by solid 
phase immunosorbent assays using immobilized Robo polypeptides of SEQ ID NO:2, 4, 6, 8, 
10 or 12. Human Robo-specific antibodies are characterized as not cross-reactive with non- 
human Robo polypeptides (SEQ ID NOS:2, 4, 6 and 12). 

Accordingly, the invention provides methods for modulating cell function comprising 
the step of modulating Robo activity, e.g. by contacting the cell with a Robo inhibitor, e.g. 
inhibitory Robo deletion mutants, Robo-specific antibodies, etc. (supra). The target cell may 
reside in culture or in situ, i.e. within the natural host. The inhibitor may be provided in any 
convenient way, including by (i) intracellular expression from a recombinant nucleic acid or 
(ii) exogenous contacting of the cell. For many in situ applications, the compositions are 
added to a retained physiological fluid such as blood or synovial fluid. For CNS 
administration, a variety of techniques are available for promoting transfer of the therapeutic 
across the blood brain barrier including disruption by surgery or injection, drugs which 
transiently open adhesion contact between CNS vasculature endothelial cells, and compounds 
which facilitate translocation through such cells. Robo polypeptide inhibitors may also be 
amenable to direct injection or infusion, topical, intratracheal/nasal administration e.g. through 
aerosol, intraocularly, or within/on implants e.g. fibers e.g. collagen, osmotic pumps, grafts 
comprising appropriately transformed cells, etc. A particular method of administration 
involves coating, embedding or derivatizing fibers, such as collagen fibers, protein polymers, 
etc. with therapeutic proteins. Other useful approaches are described in Otto et al. (1989) J 
Neuroscience Research 22, 83-91 and Otto and Unsicker (1990) J Neuroscience 10, 1912- 
1921. Generally, the amount administered will be empirically determined, typically in the 
range of about 10 to 1000 µg/kg of the recipient and the concentration will generally be in the 
range of about 50 to 500 µg/ml in the dose administered. Other additives, such as stabilizers, 
bactericides, etc., may be included in conventional amounts. For diagnostic 
uses, the inhibitors or other Robo binding agents are frequently labeled, such as with 
fluorescent, radioactive, chemiluminescent, or other easily detectable molecules, either 
conjugated directly to the binding agent or conjugated to a probe specific for the binding 
agent. 
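The amount and concentration ranges given above imply an administered volume once a recipient weight is fixed. The following minimal Python sketch shows only that arithmetic, assuming the amount is expressed in micrograms per kilogram and the concentration in micrograms per millilitre; the function name and the example values are hypothetical, and the text itself states that amounts are determined empirically.

```python
def dose_volume_ml(weight_kg, dose_ug_per_kg, conc_ug_per_ml):
    """Volume (ml) that delivers dose_ug_per_kg to a recipient of weight_kg."""
    if min(weight_kg, dose_ug_per_kg, conc_ug_per_ml) <= 0:
        raise ValueError("all inputs must be positive")
    total_ug = weight_kg * dose_ug_per_kg   # total amount to deliver
    return total_ug / conc_ug_per_ml        # volume at the given concentration


# Hypothetical example: 70 kg recipient, 100 ug/kg, formulated at 500 ug/ml.
print(dose_volume_ml(70, 100, 500))
```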

The amino acid sequences of the disclosed Robo polypeptides are used to back- 
translate Robo polypeptide-encoding nucleic acids optimized for selected expression systems 
(Holler et al. (1993) Gene 136, 323-328; Martin et al. (1995) Gene 154, 150-166) or used to 
generate degenerate oligonucleotide primers and probes for use in the isolation of natural 
Robo-encoding nucleic acid sequences ("GCG" software, Genetics Computer Group, Inc, 
Madison WI). Robo-encoding nucleic acids are used in Robo-expression vectors and 
incorporated into recombinant host cells, e.g. for expression and screening, transgenic 
animals, e.g. for functional studies such as the efficacy of candidate drugs for disease 
associated with Robo-modulated cell function, etc. 
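Generating a degenerate oligonucleotide from a peptide amounts to writing, for each residue, the IUPAC ambiguity code that covers every synonymous codon at each position. The following is a minimal Python sketch of that idea and is not the GCG software cited above: it assumes the standard genetic code, the function name `degenerate_oligo` is hypothetical, and the example peptide PRIIEHP is a short motif taken from the Drosophila rows of Table 1.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))

# IUPAC nucleotide ambiguity codes keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def degenerate_oligo(peptide):
    """Back-translate `peptide` into one degenerate oligonucleotide (IUPAC codes)."""
    out = []
    for aa in peptide.upper():
        codons = CODONS_FOR[aa]  # KeyError for letters that are not amino acids
        for pos in range(3):
            out.append(IUPAC[frozenset(c[pos] for c in codons)])
    return "".join(out)


print(degenerate_oligo("PRIIEHP"))
```

Taking the per-position union is a simplification: for the six-codon amino acids (L, R, S) the resulting code also matches some codons for other residues, so practical primer design usually splits such positions into separate oligonucleotides or picks codons favoured by the target organism.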

The invention also provides nucleic acid hybridization probes (Tables 6, 7) and 
replication / amplification primers (Tables 7, 8) having a Robo cDNA specific sequence 
comprising SEQ ID NO:1, 3, 5, 7, 9 or 11 and sufficient to effect specific hybridization 
thereto (i.e. specifically hybridize with SEQ ID NO:1, 3, 5, 7, 9 or 11, respectively, in the 
presence of CDO cDNA). 

Table 5. Hybridisation Probes for Human Roundabout 1 
Immunoglobulin Domain #1 

CCACCTCGCATTGTTGAACACCCTTCAGACCTGATTGTCTCAAAAGGAGAACCTGCAACTTTGAACTGCAAAGCT 
GAAGGCCGCCCCACACCCACTATTGAATGGTACAAAGGGGGAGAGAGAGTGGAGACAGACAAAGATGACCCTCGC 
TCACACCGAATGTTGCTGCCGAGTGGATCTTTATTTTTCTTACGTATAGTACATGGACGGAAAAGTAGACCTGAT 
GAAGGAGTCTATGTCTGTGTAGCAAGGAATTACCTTGGAGAGGCTGTGAGCCACAATGCATCGCTGGAAGTAGCC 
ATA 

Immunoglobulin Domain #2 

CTTCGGGATGACTTCAGACAAAACCCTTCGGATGTCATGGTTGCAGTAGGAGAGCCTGCAGTAATGGAATGCCAA 
CCTCCACGAGGCCATCCTGAGCCCACCATTTCATGGAAGAAAGATGGCTCTCCACTGGATGATAAAGATGAAAGA 
ATAACTATACGAGGAGGAAAGCTCATGATCACTTACACCCGTAAAAGTGACGCTGGCAAATATGTTTGTGTTGGT 
ACCAATATGGTTGGGGAACGTGAGAGTGAAGTAGCCGAGCTGACTGTCTT 

Immunoglobulin Domain #3 

AGAGAGACCATCATTTGTGAAGAGACCCAGTAACTTGGCAGTAACTGTGGATGACAGTGCAGAATTTAAATGTGA 
GGCCCGAGGTGACCCTGTACCTACAGTACGATGGAGGAAAGATGATGGAGAGCTGCCCAAATCCAGATATGAAAT 
CCGAGATGATCATACCTTGAAAATTAGGAAGGTGACAGCTGGTGACATGGGTTCATACACTTGTGTTGCAGAAAA 
TATGGTGGGCAAAGCTGAAGCATCTGCTACTCTGACTGTTCAAGAACC 

Immunoglobulin Domain #4 

CCACATTTTGTTGTGAAACCCCGTGACCAGGTTGTTGCTTTGGGACGGACTGTAACTTTTCAGTGTGAAGCAACC 
GGAAATCCTCAACCAGCTATTTTCTGGAGGAGAGAAGGGAGTCAGAATCTACTTTTCTCATATCAACCACCACAG 
TCATCCAGCCGATTTTCAGTCTCCCAGACTGGCGACCTCACAATTACTAATGTCCAGCGATCTGATGTTGGTTAT 
TACATCTGCCAGACTTTAAATGTTGCTGGAAGCATCATCACAAAGGCATATTTGGAAGTTACAGATGTGATTGCA 

Immunoglobulin Domain #5 

GATCGGCCTCCCCCAGTTATTCGACAAGGTCCTGTGAATCAGACTGTAGCCGTGGATGGCACTTTCGTCCTCAGC 
TGTGTGGCCACAGGCAGTCCAGTGCCCACCATTCTGTGGAGAAAGGATGGAGTCCTCGTTTCAACCCAAGACTCT 
CGAATCAAACAGTTGGAGAATGGAGTACTGCAGATCCGATATGCTAAGCTGGGTGATACTGGTCGGTACACCTGC 
ATTGCATCAACCCCCAGTGGTGAAGCAACATGGAGTGCTTACATTGAAGTTCAAGAATTTG 

Fibronectin Domain #1 

GAGTTCCAGTTCAGCCTCCAAGACCTACTGACCCAAATTTAATCCCTAGTGCCCCATCAAAACCTGAAGTGACAG 
ATGTCAGCAGAAATACAGTCACATTATCGTGGCAACCAAATTTGAATTCAGGAGCAACTCCAACATCTTATATTA 
TAGAAGCCTTCAGCCATGCATGTGGTAGCAGCTGGCAGAGCGTAGCAGAGAATGTGAAAACAGAAACATCTGCCA 
TTAAAGGACTCAAACCTAATGCAATTTACCTTTTCCTTGTGAGGGCAGCTAATGCATATGGAATTAGTGATC 

Fibronectin Domain #2 

CAAGCCAAATATCAGATCCAGTGAAAACACAAGATGTCCTACCAACAAGTCAGGGGGTGGACCACAAGCAGGTCC 
AGAGAGAGCTGGGAAATGCTGTTCTGCACCTCCACAACCCCACCGTCCTTTCTTCCTCTTCCATCGAAGTGCACT 
GGACAGTAGATCAAGAGTCTCAGTATATACAAGGATATAAAATTCTCTATCGGCCATCTGGAGCCAACCACGGAG 
AATCAGACTGGTTAGTTTTTGAAGTGAGGACGCCAGCCAAAAACAGTGTGGTAATCCCTGATCTCAGAAAGGGAG 
TCAACTATGAAATTAAGGCTCGCCCTTTTTTTAATGAATTTCAAGGAGCAG 

Fibronectin Domain #3 

ATAGTGAAATCAAGTTTGCCAAAACCCTGGAAGAAGCACCCAGTGCCCCACCCCAAGGTGTAACTGTATCCAAGA 
ATGATGGAAACGGAACTGCAATTCTAGTTAGTTGGCAGCCACCTCCAGAAGACACTCAAAATGGAATGGTCCAAG 
AGTATAAGGTTTGGTGTCTGGGCAATGAAACTCGATACCACATCAACAAAACAGTGGATGGTTCCACCTTTTCCG 
TGGTCATTCCCTTTCTTGTTCCTGGAATCCGATACAGTGTGGAAGTGGCAGCCAGCACTGGGGCTGGGTCTGGGG 
TAAAG 

Transmembrane Domain 

AGATTTCAGATGTGGTGAAGCAGCCGGCCTTCATAGCAGGTATTGGAGCAGCCTGTTGGATCATCCTCATGGTCT 
TCAGCATCTGGCTTTATCGACACCG 

Cytoplasmic Motif #1 

AATCTGAAGGATGGGCGTTTTGTCAATCCATCAGGGCAGCCTACTCCTTACGCCACCACTCAGCTCATCCAGTCA 
AACCTCAGCAACAACATGAACAATG 

Cytoplasmic Motif #2 

CCCAAGGTACCAAAACAGGGTGGCATGAACTGGGCAGACCTGCTTCCTCCTCCCCCAGCACATCCTCCTCCACAC 
AGCAATAGCGAAGAGTACAACATTT 

Cytoplasmic Motif #3 

CCAGCCAGGACATCTGCGCAGAGAAACCTACACAGATGATCTTCCACCACCTCCTGTGCCGCCACCTGCTATAAA 
GTCACCTACTGCCCAATCCAAGACA 

Table 6. Hybridisation Probes for Human Roundabout 2 
Immunoglobulin Domain #4 

CAGATTGTTGCTCAAGGTCGAACAGTGACATTTCCCTGTGAAACTAAAGGAAACCCACAGCCAGCTGTTTTTTGG 
CAGAAAGAAGGCAGCCAGAACCTACTTTTCGCAAACCAACCCCAGCAGCCCAACAGTAGATGCTCAGTGTCACCA 
ACTGGAGACCTCACAATCACCAACATTCAACGTTCCGACGCGGGTTACTACATCTGCCAGGCTTTAACTGTGGCA 
GGAAGCATTTTAGCAAAAGCTCAACTGGAGGTTACTGATGTTTTGACA 

Immunoglobulin Domain #5 

GATAGACCTCCACCTATAATTCTACAAGGGCCAGCCAACCAAACGCTGGCAGTGGATGGTACAGCGTTACTGAAA 
TGTAAAGCCACTGGTGATCCTCTTCCTGTAATTAGCTGGTTAAAGGAGGGATTTACTTTTCCGGGTAGAGATCCA 
AGAGCAACAATTCAAGAGCAAGGCACACTGCAGATTAAGAATTTACGGATTTCTGATACTGGCAGTTATACTTGT 
GTGGCTACAAGTTCAAGTGGAGAGGCTTCCTGGAGTGCAGTGCTGGATGTGACAGAGTCT 

Fibronectin Domain #1 

GGAGCAACAATCAGTAAAAACTATGATTTAAGTGACCTGCCAGGGCCACCATCCAAACCGCAAGTCACTGATGTT 
ACTAAGAACAGTGTCACCTTGTGCTGGCAGCCAGGTACCGCTGGAACCCTTCCAGCAAGTGCATATATCATTGAG 
GCTTTCAGCCAATCAGTGAGCAACAGCTGGCAGACCGTGGCAAACCATGTAAAGACCACCCTCTATACTGTAAGA 
GGACTGCGGCCCAATACAATCTACTTATTCATGGTCAGAGCGATCAACCCCAAGGTYTCAGTGACCCAAGT 

Table 7. Primer Pairs for PCR of Human Roundabout 1 Domains 
Immunoglobulin Domain #1 

Forward: 5' CCACCTCGCATTGTTGAACACCCTTCAGAC 3' 
Reverse: 5' ATGGCTACTTCCAGCGATGCATTGTGGCTC 3' 

Immunoglobulin Domain #2 

Forward: 5' CTTCGGGATGACTTCAGACAAAACCCTTCG 3' 
Reverse: 5' TAAGACAGTCAGCTCGGCTACTTCACTCTC 3' 

Immunoglobulin Domain #3 

Forward: 5' AGAGAGACCATCATTTGTGAAGAGACCCAG 3' 
Reverse: 5' AGGTTCTTGAACAGTCAGAGTAGCAGATGC 3' 

Immunoglobulin Domain #4 

Forward: 5' CCACATTTTGTTGTGAAACCCCGTGACCAG 3' 
Reverse: 5' TGCAATCACATCTGTAACTTCCAAATATGC 3' 

Immunoglobulin Domain #5 

Forward: 5' ATCGGCCTCCCCCAGTTATTCGACAAGGTC 3' 
Reverse: 5' CAAATTCTTGAACTTCAATGTAAGCACTCC 3' 

Fibronectin Domain #1 

Forward: 5' GAGTTCCAGTTCAGCCTCCAAGACCTACTG 3' 
Reverse: 5' TCACTAATTCCATATGCATTAGCTGCGCTC 3' 

Fibronectin Domain #2 

Forward: 5' CAAGCCAAATATCAGATCCAGTGAAAACAC 3' 
Reverse: 5' ATCTGCTCCTTGAAATTCATTAAAAAAAGG 3' 

Fibronectin Domain #3 

Forward: 5' ATAGTGAAATCAAGTTTGCCAAAACCCTG 3' 
Reverse: 5' CTCTTTACCCCAGACCCAGCCCCAGTGCTG 3' 

Transmembrane Domain 

Forward: 5' GGACCAAGTCAGCCTCGCTCAGCAGATTTC 3' 
Reverse: 5' ACTAGTAAGTCCGTTTCTCTTCTTGCGGTG 3' 

Cytoplasmic Motif #1 

Forward: 5' CTGAAGGATGGGCGTTTTGTCAATCCATC 3' 
Reverse: 5' GTCCCAGTGGTTTCCAGTGCTTCTCGCCAG 3' 

Cytoplasmic Motif #2 

Forward: 5' GGCACAAGAAAGGGGCAAGAACACCCAAGG 3' 
Reverse: 5' ATAGCTTTCATCTACAGAAATGTTGTACTC 3' 

Cytoplasmic Motif #3 

Forward: 5' ACGAGACCAGCCAAGAAACTGAAACACCAG 3' 
Reverse: 5' GTACTTCCAGCTGTGTCTTGGATTGGGCAG 3' 


Table 8. Human Roundabout 2 Primer Pairs 
Immunoglobulin Domain #4 

Forward: 5' GTTGCTCAAGGTCGAACAGTGACATTTCCC 3' 
Reverse: 5' TGTCAAAACATCAGTAACCTCCAGTTGAGC 3' 

Immunoglobulin Domain #5 

Forward: 5' GATAGACCTCCACCTATAATTCTACAAGGC 3' 
Reverse: 5' GACTCTGTCACATCCAGCACTGCACTCGAG 3' 

Fibronectin Domain #1 

Forward: 5' CAATCAGTAAAAACTATGATTTAAGTG 3' 
Reverse: 5' TCGCTCTGACCATGAATAAGTAGATTG 3' 

Such primers or probes are at least 12, preferably at least 24, more preferably at least 36 and 
most preferably at least 96 bases in length. Demonstrating specific hybridization generally 
requires stringent conditions, for example, hybridizing in a buffer comprising 30% formamide 
in 5 x SSPE (0.18 M NaCl, 0.01 M NaPO4, pH 7.7, 0.001 M EDTA) buffer at a temperature of 
42°C and remaining bound when subject to washing at 42°C with 0.2 x SSPE; preferably 
hybridizing in a buffer comprising 50% formamide in 5 x SSPE buffer at a temperature of 
42°C and remaining bound when subject to washing at 42°C with 0.2 x SSPE buffer. 
Robo nucleic acids can also be distinguished using alignment algorithms, such as BLASTX 
(Altschul et al. (1990) Basic Local Alignment Search Tool, J Mol Biol 215, 403-410). 
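
As an illustration of how a listed primer pair relates to its probe, the following minimal Python sketch checks that the Table 7 primers for Immunoglobulin Domain #4 of Human Roundabout 1 flank the corresponding probe listed above: the forward primer should equal the 5' end of the probe, and the reverse primer should be the reverse complement of its 3' end. The function names are hypothetical, and only the first and last lines of the probe are reproduced.

```python
# Illustrative sketch only; the function names are hypothetical and are not
# part of the disclosure.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primers_flank(probe_head: str, probe_tail: str,
                  forward: str, reverse: str) -> bool:
    """True if the forward primer opens the probe and the reverse primer
    is the reverse complement of the probe's 3' end."""
    return (probe_head.startswith(forward)
            and probe_tail.endswith(reverse_complement(reverse)))


# First and last lines of the Immunoglobulin Domain #4 probe (Human Roundabout 1).
PROBE_HEAD = ("CCACATTTTGTTGTGAAACCCCGTGACCAGGTTGTTGCTTTGGGACGG"
              "ACTGTAACTTTTCAGTGTGAAGCAACC")
PROBE_TAIL = ("TACATCTGCCAGACTTTAAATGTTGCTGGAAGCATCATCACAAAGGCA"
              "TATTTGGAAGTTACAGATGTGATTGCA")

# Table 7 primer pair for the same domain.
FORWARD = "CCACATTTTGTTGTGAAACCCCGTGACCAG"
REVERSE = "TGCAATCACATCTGTAACTTCCAAATATGC"

if __name__ == "__main__":
    print(reverse_complement(REVERSE))
    print(primers_flank(PROBE_HEAD, PROBE_TAIL, FORWARD, REVERSE))
```

The same check can be applied to the other primer pairs by substituting the relevant probe ends and primers.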

The subject nucleic acids are of synthetic/non-natural sequences and/or are isolated, 
i.e. unaccompanied by at least some of the material with which it is associated in its natural 
state, preferably constituting at least about 0.5%, preferably at least about 5% by weight of 
total nucleic acid present in a given fraction, and usually recombinant, meaning they comprise 
a non-natural sequence or a natural sequence joined to nucleotide(s) other than that which it is 
joined to on a natural chromosome. The subject recombinant nucleic acids comprising the 
nucleotide sequence of SEQ ID NO:1, 3, 5, 7, 9 or 11, or fragments thereof, contain such 
sequence or fragment at a terminus, immediately flanked by (i.e. contiguous with) a sequence 
other than that which it is joined to on a natural chromosome, or flanked by a native flanking 
region fewer than 10 kb, preferably fewer than 2 kb, more preferably fewer than 500 bp, 
which is at a terminus or is immediately flanked by a sequence other than that which it is 
joined to on a natural chromosome. While the nucleic acids are usually RNA or DNA, it is 
often advantageous to use nucleic acids comprising other bases or nucleotide analogs to 
provide modified stability, etc. 

In a particular embodiment, expressed sequence tags EST yu23d11, Accession 
#H77734 and EST yq76e12, Accession #H52936, and deletion mutants thereof, are not within 
the scope of the present invention. In another embodiment, the subject Robo nucleic acids 
exclude the corresponding regions of the disclosed natural human Robo I nucleic acids, i.e. 
SEQ ID NO:7, nucleotides 500-651 and SEQ ID NO:7, nucleotides 3945-4455. 

Table 10. Exemplary differences between H52936 and corresponding human Robo I 
sequences. 

(1) At position 86, there is a T instead of an A. The new codon therefore reads TGA (Stop) 
instead of AGA(R). 

(2) There is a missing G at position 286-7, causing a frameshift. 

(3) There is an extra G at position 334, causing a frameshift. 

(4) There is an extra T at position 344, causing a frameshift. 

(5) There is an extra N at position 357, causing a frameshift. 

(6) There is a T instead of a C at 362. The new codon reads TTT (F) instead of TCT (S). 

(7) There is an extra T at position 364, causing a frameshift. 

(8) There is an extra N at position 370, causing a frameshift and a changed amino acid (the 
codon TTN is ambiguous). 

(9) There are two Ts at position 394 and 395 instead of a C, causing a frameshift and amino 
acid changes. 

Table 11. Exemplary differences between H52937 (reverse sequence) and corresponding 
human Robo I sequences. 

(1) There are multiple errors in the first 30 bases. 

(2) At position 63, a G replaces an A. The new codon CGG codes for R instead of CAG for Q. 

(3) The EST ends by joining to part of the human glycophorin B gene (353-442). 
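
The codon-level consequences listed in Tables 10 and 11 follow from the standard genetic code. The minimal Python sketch below, which assumes the standard code and uses hypothetical helper names, classifies the single-base substitutions named in those tables.

```python
# Illustrative sketch assuming the standard genetic code; helper names are
# hypothetical.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")

# Build the 64-entry codon table; '*' marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(ref_codon: str, alt_codon: str):
    """Return (reference residue, altered residue, kind of change)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if alt_aa == ref_aa:
        kind = "silent"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return ref_aa, alt_aa, kind


if __name__ == "__main__":
    # Table 10, item (1); Table 10, item (6); Table 11, item (2).
    for ref, alt in [("AGA", "TGA"), ("TCT", "TTT"), ("CAG", "CGG")]:
        print(ref, alt, classify_substitution(ref, alt))
```

Single-base insertions and deletions, such as items (2) to (5) of Table 10, are not handled by this helper because they shift the reading frame of every downstream codon.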

The subject nucleic acids find a wide variety of applications including use as 
translatable transcripts, hybridization probes, PCR primers, diagnostic nucleic acids, etc.; use 
in detecting the presence of Robo genes and gene transcripts and in detecting or amplifying 
nucleic acids encoding additional Robo homologs and structural analogs. In diagnosis, Robo 
hybridization probes find use in identifying wild-type and mutant Robo alleles in clinical and 
laboratory samples. Mutant alleles are used to generate allele-specific oligonucleotide (ASO) 
probes for high-throughput clinical diagnoses. In therapy, therapeutic Robo nucleic acids are 
used to modulate cellular expression or intracellular concentration or availability of active 
Robo. 

The invention provides efficient methods of identifying agents, compounds or lead 
compounds for agents active at the level of a Robo modulatable cellular function. Generally, 
these screening methods involve assaying for compounds which modulate 
Robo interaction with a natural Robo binding target. A wide variety of assays for binding 
agents are provided including labeled in vitro protein-protein binding assays, immunoassays, 
cell based assays, etc. The methods are amenable to automated, cost-effective high 
throughput screening of chemical libraries for lead compounds. Identified reagents find use in 
the pharmaceutical industries for animal and human trials; for example, the reagents may be 
derivatized and rescreened in in vitro and in vivo assays to optimize activity and minimize 
toxicity for pharmaceutical development. 

Cell and animal based neural guidance/repulsion assays are described in detail in the 
experimental section below. In vitro binding assays employ a mixture of components 
including a Robo polypeptide, which may be part of a fusion product with another peptide or 
polypeptide, e.g. a tag for detection or anchoring, etc. The assay mixtures comprise a natural 
intracellular Robo binding target. While native full-length binding targets may be used, it is 
frequently preferred to use portions (e.g. peptides) thereof so long as the portion provides 
binding affinity and avidity to the subject Robo polypeptide conveniently measurable in the 
assay. The assay mixture also comprises a candidate pharmacological agent. Candidate agents 
encompass numerous chemical classes, though typically they are organic compounds, 
preferably small organic compounds, and are obtained from a wide variety of sources 
including libraries of synthetic or natural compounds. A variety of other reagents may also be 
included in the mixture. These include reagents like salts, buffers, neutral proteins (e.g. 
albumin), detergents, protease inhibitors, nuclease inhibitors, antimicrobial agents, etc. 

The resultant mixture is incubated under conditions whereby, but for the presence of 
the candidate pharmacological agent, the Robo polypeptide specifically binds the cellular 
binding target, portion or analog with a reference binding affinity. The mixture components 
can be added in any order that provides for the requisite bindings and incubations may be 
performed at any temperature which facilitates optimal binding. Incubation periods are 
likewise selected for optimal binding but also minimized to facilitate rapid, high-throughput 
screening. 

After incubation, the agent-biased binding between the Robo polypeptide and one or 
more binding targets is detected by any convenient way. Where at least one of the Robo or 
binding target polypeptide comprises a label, the label may provide for direct detection as 
radioactivity, luminescence, optical or electron density, etc. or indirect detection such as an 
epitope tag, etc. A variety of methods may be used to detect the label depending on the 
nature of the label and other assay components, e.g. through optical or electron density, 
radiative emissions, nonradiative energy transfers, etc. or indirectly detected with antibody 
conjugates, etc. 

A difference in the binding affinity of the Robo polypeptide to the target in the absence 
of the agent as compared with the binding affinity in the presence of the agent indicates that 
the agent modulates the binding of the Robo polypeptide to the Robo binding target. For 
example, in the cell-based assay also described below, a difference in Robo-dependent 
modulation of axon outgrowth or orientation in the presence and absence of an agent indicates 
the agent modulates Robo function. A difference, as used herein, is statistically significant 
and preferably represents at least a 50%, more preferably at least a 90% difference. 
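
The threshold in the preceding paragraph can be expressed as a simple calculation. The sketch below is a hedged illustration only: it assumes the difference is measured as a percentage of the binding signal observed without the agent, which the text does not specify, and it leaves the test of statistical significance to the caller. All names are hypothetical.

```python
# Illustrative sketch; the normalisation to the no-agent signal is an
# assumption, and all names are hypothetical.
def percent_difference(signal_without_agent: float,
                       signal_with_agent: float) -> float:
    """Absolute change in binding signal as a percentage of the no-agent signal."""
    if signal_without_agent <= 0:
        raise ValueError("reference signal must be positive")
    return abs(signal_without_agent - signal_with_agent) * 100.0 / signal_without_agent


def classify_agent(signal_without_agent: float,
                   signal_with_agent: float,
                   statistically_significant: bool) -> str:
    """Grade a candidate agent against the 50% and 90% thresholds."""
    if not statistically_significant:
        return "no difference"
    diff = percent_difference(signal_without_agent, signal_with_agent)
    if diff >= 90.0:
        return "difference of at least 90%"
    if diff >= 50.0:
        return "difference of at least 50%"
    return "significant difference below 50%"


if __name__ == "__main__":
    print(classify_agent(200.0, 100.0, statistically_significant=True))
```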

The following experimental section and examples are offered by way of illustration 
and not by way of limitation. 

EXPERIMENTAL 

Cloning of the roundabout Gene. The robo 1 allele was mapped to the plexus-brown 
interval on the right arm of the second chromosome by recombination mapping; the numbers 
of recombinants suggested a map position very close to plexus at 58F/59A. One deficiency 
[Df(2R)P 9 which deletes 58E3/F1 through 60D14/E2] fails to complement robo mutations, 
two other deficiencies [Df(2R)59AB and Df(2R)59AD, which delete 59A1/3 through 59B1/2 
and 59A1/3 through 59D1/4 respectively] do complement robo, and a duplication 
[Dp(2;Y)bw + Y, which duplicates 58F1/59A2 through 60E3/F1] rescues robo mutations. This 
mapping places robo in the 58F/59A region. 

We initiated chromosomal walks from P1 clones mapped to the region, beginning 
from the distal side using clone DS02204 and from the proximal side using clone DS05609. 
We used cosmid clones (Tamkun et al., 1992) to complete a walk of ~150 kb. We then looked 
for RFLPs in the recombinants between the multiply marked chromosome and the robo 
mutant chromosome. A 6.8 kb EcoRI fragment from cosmid 106-5 identified a HindII RFLP 
on the mapping chromosome that was present on a single robo mutant recombinant line. This 
fragment identified a proximal limit for the location of robo. Further deficiencies in this 
region were then tested (Kerrebrock et al., 1995). Of these deficiencies, Df(2R)X58-5 and 
Df(2R)X58-12 remove robo while Df(2R)X58-1 does not. Df(2R)X58-12 fails to complement 
Df(2R)59AB yet complements Df(2R)59AD, indicating that Df(2R)59AB extends further 
proximal; this proximal endpoint provides a distal limit for the location of robo. Probes from 
the walk were used to identify the breakpoints of these deficiencies (Figure 1A). Df(2R)X58-1 
breaks in a 9.6 kb EcoRI/BamHI fragment within cosmid GJ12, whereas Df(2R)59AB breaks 
in an 8 kb BamHI/EcoRI fragment within cosmid 106-1435. This reduces the location of robo 
to a 75 kb region bounded by these restriction fragments. Hybridization of 0-16 hr poly-A+ 
embryonic Northern blots with cosmids GJ12, 106-12, and 106-1435 revealed at least five 
transcripts. Reverse Northern mapping identified the regions containing these transcripts 
(Figure 1A). These regions were used as probes to isolate cDNAs. Seven different cDNAs 
were isolated and analyzed by in situ hybridization. The expression pattern of five of these 
transcripts allowed us to tentatively discount them as encoding robo since they were not 
expressed in the embryonic CNS at the appropriate stage. Of the two cDNAs remaining, 12-1 
appeared by its size and expression the most likely candidate for robo. A 16 kb XbaI 
fragment including the 12-1 transcript and a region 5' to the transcript is capable of rescuing 
the robo mutant. 

roundabout Encodes a Member of the Immunoglobulin Superfamily. We recovered 
and sequenced overlapping cDNA clones corresponding to the 12-1 transcription unit. A 
single long open reading frame (ORF) that encodes 1395 amino acids was identified (D1 in 
Table 1). Conceptual translation of the ORF reveals the Robo protein to be a member of the 
Ig superfamily; Robo's ectodomain contains five immunoglobulin (Ig)-like repeats followed 
by three fibronectin (Fn) type-III repeats. The predicted ORF also contains a transmembrane 
domain and a large 457 amino acid (a.a.) cytoplasmic domain. Hydropathy analysis of the 
Robo sequence indicates a single membrane spanning domain of 25 a.a. (Kyte and Doolittle, 
1982) plus a signal sequence with a predicted cleavage site between G51 and Q52 (Nielsen et 
al., 1997). 
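
Hydropathy analysis of the kind cited above (Kyte and Doolittle, 1982) averages per-residue hydropathy values over a sliding window and looks for a sustained hydrophobic stretch. The following Python sketch illustrates the idea. The window size of 19, the cutoff of 1.6 often used for membrane-spanning segments, and the toy sequence are assumptions chosen for illustration; they do not reproduce the analysis actually performed on Robo.

```python
# Illustrative sketch of a Kyte-Doolittle sliding-window hydropathy profile.
# Window size, cutoff and the toy sequence are assumptions for illustration.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19):
    """Mean hydropathy of every full window along the protein."""
    scores = [KD_SCALE[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_windows(protein: str, window: int = 19, cutoff: float = 1.6):
    """1-based start positions of windows whose mean exceeds the cutoff."""
    return [i + 1
            for i, mean in enumerate(hydropathy_profile(protein, window))
            if mean > cutoff]


# Made-up sequence: charged flanks around a 19-residue leucine stretch.
TOY_PROTEIN = "DDKKEE" + "L" * 19 + "RRKKDD"

if __name__ == "__main__":
    print(hydrophobic_windows(TOY_PROTEIN))
```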

We identify the 12-1 transcript as encoding robo based on several criteria. First, the 
embryonic robo phenotype can be rescued by the 16 kb XbaI genomic fragment containing 
this cDNA; no other transcripts are contained in this 16 kb XbaI fragment. Second, we 
identified a CfoI RFLP associated with the allele robo 6. This polymorphism is due to a 
change of nucleotide 332 of the ORF from G to A, which results in a change of Gly111 to Asp. 
Gly111 is in the first Ig domain (Figure 2), and is conserved in all Robo homologues 
identified. The change is specific to the allele robo 6 and is not seen in the parental 
chromosome or in any of the other seven alleles, all of which were generated from the same 
parental genotype. Third, the production of antibodies (below) which recognize the Robo 
protein reveals that the alleles robo 1, robo 2, robo 3, robo 4 and robo 5 do not produce Robo 
protein (Table 12). 

Table 12. robo Mutant Alleles 


Allele    Synonym   Class 
robo 1    GA285     Protein null 
robo 2    GA1112    Protein null 
robo 3    Z14       Protein null 
robo 4    Z570      Protein null 
robo 5    Z1772     Protein null 
robo 6    Z1757     Protein positive; Gly111 to Asp 
robo 7    Z2130     Reduced protein levels 
robo 8    Z3127     Protein positive 


All alleles were generated by EMS mutagenesis of FasIII null chromosomes. Each of these 
alleles appears to represent a complete, or near complete, loss-of-function phenotype for robo, 
since the mutant phenotype observed when these alleles are placed over a chromosome 
deficient for the robo locus [Df(2R)X58-5] is indistinguishable from the homozygous allele. 

Finally, transgenic neural expression of robo rescues the midline crossing phenotype of robo 
mutants (see below). 
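
The correspondence between nucleotide 332 of the ORF and residue 111 can be checked arithmetically, and the standard genetic code constrains which glycine codons a G-to-A change can convert to aspartate. The Python sketch below assumes that ORF numbering starts at the first base of the initiation codon and uses the standard genetic code; the helper names are hypothetical.

```python
# Illustrative sketch; assumes ORF numbering starts at the first base of the
# initiation codon and the standard genetic code. Helper names are hypothetical.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_coordinates(nt_position: int):
    """Map a 1-based ORF nucleotide position to
    (1-based codon number, 1-based position within that codon)."""
    index = nt_position - 1
    return index // 3 + 1, index % 3 + 1


def glycine_g_to_a_outcomes(position_in_codon: int):
    """For each glycine codon with G at the given position, return the codon
    and residue produced by changing that G to A."""
    outcomes = {}
    for codon, residue in CODON_TABLE.items():
        # residue "G" is the one-letter code for glycine
        if residue == "G" and codon[position_in_codon - 1] == "G":
            mutated = (codon[:position_in_codon - 1] + "A"
                       + codon[position_in_codon:])
            outcomes[codon] = (mutated, CODON_TABLE[mutated])
    return outcomes


if __name__ == "__main__":
    codon_number, position = codon_coordinates(332)
    print(codon_number, position)
    print(glycine_g_to_a_outcomes(position))
```

Under these assumptions the change falls at the second position of codon 111. A G-to-A change at that position converts GGT or GGC to an aspartate codon, whereas GGA or GGG would give glutamate; which glycine codon is actually present is given by the Sequence Listing, not by this sketch.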

Developmental Northern blot analysis using both cDNA and genomic probes suggests 
that robo is encoded by a single transcript of ~7500 bp. We sequenced genomic DNA and 
identified 17 introns within the sequence, of which 14 are only 50-75 bp in length, plus three 
introns of 843 bp, 236 bp, and 110 bp (Figure 1B). The precise start point of the transcript has 
not been determined. 

A Family of Evolutionarily Conserved Robo-like Proteins. The presence of five Ig 
and three Fn domains, a transmembrane domain, and a long (457 a.a.) cytoplasmic region 
indicates that Robo may be a receptor and signaling molecule. The netrin receptor 
DCC/Frazzled/UNC-40 has a related domain structure, with 6 Ig and 4 Fn domains and a 
similarly long cytoplasmic region (Keino-Masu et al., 1996; Chan et al., 1996; Kolodziej et 
al., 1996). The only currently known protein with a "5 + 3" organization is CDO (Kang et al., 
1997). However, CDO is only distantly related to Robo (15-33% a.a. identity between 
corresponding Ig and Fn domains). 

We identified other "5 + 3" proteins in vertebrates whose amino acid identity to Robo exceeds 
that of CDO and which represent Robo homologues. A human expressed sequence tag (EST; 
yu23d11, Accession #H77734) shows high homology to the second Ig domain of robo and 
was used to probe a human fetal brain cDNA library (Stratagene). The clones recovered 
correspond to a human gene with five Ig and three Fn domains (Figure 2). Exemplary 
functional Robo domains are listed in Tables 13-17 (the corresponding encoding nucleic acids 
are readily discernible from the corresponding nucleic acid sequences of the Sequence Listing). 


Table 13. Exemplary domains of human Robo I, by amino acid sequence positions 

Signal sequence:                  6-21 
First Immunoglobulin domain:      68-167 
Second Immunoglobulin domain:     168-258 
Third Immunoglobulin domain:      259-350 
Fourth Immunoglobulin domain:     351-450 
Fifth Immunoglobulin domain:      451-546 
First Fibronectin domain:         547-644 
Second Fibronectin domain:        645-761 
Third Fibronectin domain:         762-862 
Transmembrane domain:             896-917 
Cytoplasmic motif #1:             1070-1079 
Cytoplasmic motif #2:             1181-1195 
Cytoplasmic motif #3:             1481-1488 

Table 14. Exemplary domains of human Robo II, by amino acid sequence positions 

Fourth Immunoglobulin domain:     1-91 
Fifth Immunoglobulin domain:      92-185 
First Fibronectin domain:         186-282 

Table 15. Exemplary domains of Drosophila Robo I, by amino acid sequence positions 

Signal sequence:                  30-46 
First Immunoglobulin domain:      56-152 
Second Immunoglobulin domain:     153-251 
Third Immunoglobulin domain:      252-344 
Fourth Immunoglobulin domain:     345-440 
Fifth Immunoglobulin domain:      441-535 
First Fibronectin domain:         536-635 
Second Fibronectin domain:        636-753 
Third Fibronectin domain:         754-854 
Transmembrane domain:             915-938 
Cytoplasmic motif #1:             1037-1046 
Cytoplasmic motif #2:             1098-1119 
Cytoplasmic motif #3:             1262-1269 

Table 16. Exemplary domains of Drosophila Robo II, by amino acid sequence positions 

Immunoglobulin domain #1:         4-99 
Immunoglobulin domain #2:         100-192 
Immunoglobulin domain #3:         193-296 
Immunoglobulin domain #4:         297-396 
Immunoglobulin domain #5:         397-494 
Fibronectin domain #1:            495-595 
Fibronectin domain #2:            596-770 
Fibronectin domain #3:            771-877 
Transmembrane domain:             906-929 
Conserved cytoplasmic motif #1:   1075-1084 

Table 17. Exemplary domains of C. elegans Robo I, by amino acid sequence positions 

First Immunoglobulin domain:      30-129 
Second Immunoglobulin domain:     130-223 
Third Immunoglobulin domain:      224-315 
Fourth Immunoglobulin domain:     316-453 
Fifth Immunoglobulin domain:      454-543 
First Fibronectin domain:         544-643 
Second Fibronectin domain:        644-766 
Third Fibronectin domain:         767-865 
Transmembrane domain:             900-922 
Cytoplasmic motif #1:             1036-1045 
Cytoplasmic motif #2:             1153-1163 
Cytoplasmic motif #3:             1065-1074 
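
Domain boundaries such as those in Tables 13-17 are 1-based, inclusive amino acid positions, so they translate directly into sequence slices and lengths. The following Python sketch is an illustration using the Drosophila Robo I coordinates of Table 15 and the 1395 amino acid ORF length stated above. The fourth Ig domain is taken as 345-440, a reading inferred from its neighbouring domains, and the helper names are hypothetical.

```python
# Illustrative sketch; helper names are hypothetical. Coordinates are 1-based
# and inclusive, taken from Table 15 (Drosophila Robo I). The Ig4 range
# 345-440 is an inferred reading based on the adjacent domains.
D_ROBO1_ORF_LENGTH = 1395
D_ROBO1_DOMAINS = [
    ("Ig1", 56, 152), ("Ig2", 153, 251), ("Ig3", 252, 344),
    ("Ig4", 345, 440), ("Ig5", 441, 535),
    ("Fn1", 536, 635), ("Fn2", 636, 753), ("Fn3", 754, 854),
    ("TM", 915, 938),
]


def domain_length(start: int, end: int) -> int:
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def extract_domain(protein: str, start: int, end: int) -> str:
    """Slice a protein string using 1-based inclusive coordinates."""
    return protein[start - 1:end]


def gaps_between(domains):
    """Residues lying between each pair of consecutive domains."""
    return [(a[0], b[0], b[1] - a[2] - 1)
            for a, b in zip(domains, domains[1:])]


def cytoplasmic_length(orf_length: int, tm_end: int) -> int:
    """Residues C-terminal to the transmembrane domain."""
    return orf_length - tm_end


if __name__ == "__main__":
    for name, start, end in D_ROBO1_DOMAINS:
        print(name, domain_length(start, end))
    print(gaps_between(D_ROBO1_DOMAINS))
    print(cytoplasmic_length(D_ROBO1_ORF_LENGTH, 938))
```

With these inputs, the residues following the transmembrane domain number 1395 - 938 = 457, which agrees with the 457 amino acid cytoplasmic domain described for the Drosophila Robo ORF.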


The homology is particularly high in the first two Ig domains (58% and 48% a.a. identity 
respectively, compared to 26% and 30% for the same two Ig domains between D-Robo1 and 
CDO) and together with the overall identity throughout the extracellular region and the 
presence of three conserved cytoplasmic motifs has led us to designate this as the human 
roundabout 1 gene (H-robo1). Database searching reveals a nucleotide sequence 
corresponding to H-robo1 in the database, DUTT1, which differs by a 9 bp insertion, by seven 
single base pair changes, and in the signal sequence, the last suggesting alternative splicing. Five 
ESTs (see Experimental Procedures) show high sequence similarity to the cytoplasmic domain 
of H-robo1. Sequencing of cDNAs isolated using one of these ESTs as a probe confirmed a 
second human roundabout gene (H-robo2). 

Degenerate PCR primers based on conserved sequences between H-robo1 and 
D-robo1 were used to isolate a PCR fragment from a rat embryonic E13 brain cDNA library. 
The fragment was used to probe an E13 spinal cord cDNA library, resulting in the isolation of 
a full length rat robo gene (R-robo1). The predicted protein shows high sequence identity 
(>95%) with H-robo1 over the entire length. The 5' sequences of different R-robo1 cDNA 
clones indicate that this gene is alternatively spliced in a similar fashion to H-robo1/DUTT1. 
We used a similar approach to isolate cDNA clones for R-robo2, which is highly homologous 
to H-robo2. 

The mouse EST vi92e02 is highly homologous to the cytoplasmic portion of H-robo1. 
The C. elegans Sax-3 gene is also a robo homologue (Table 1; Zallen et al., 1997). A second 
Drosophila robo gene (D-robo2) is also predicted from analysis of genomic sequence in the 
public database. Taken together, these data indicate that Robo is the founding member of a 
new subfamily of Ig superfamily proteins with at least one member in nematode, two in 
Drosophila, two in rat, and two in human. 

The alignment of the Robo family proteins reveals that the first and second Ig domains 
are the most highly conserved portion of the extracellular domain. The cytoplasmic domains 
are highly divergent except for the presence of three highly conserved motifs (Table 18). 

Table 18. Conserved Cytoplasmic Motifs: Amino acid alignments of the three conserved 
cytoplasmic motifs are shown below the structure; in C. elegans robo, motifs #2 and #3 have 
been switched to provide a better alignment. 


Conserved Cytoplasmic Motif #1 
PDNPTPYATTMIIGTSS 1050 Drosophila roundabout-I 
SGQPTPYATTQLIQSNL 1083 Human roundabout-I 
NASPAPYATSSILSPHQ 1088 Drosophila roundabout-II 
HDDPSPYATTTLVLSNQ 1049 C. elegans roundabout 
   PtPYATT.hh....      Consensus (where h is I, L or V) 


Conserved Cytoplasmic Motif #2 
INWSE.FLPPPPEHPPPSSTYG.Y 1119 Drosophila roundabout-I 
MNWAD.LLPPPPAHPPPHSNSEEY 1202 Human roundabout-I 
STWA3STVPLPPPPVQPLPGTELEHY 31 Human roundabout-II 
KTLMD.FIPPPPSNPPPP.GGHVY 1168 C. elegans roundabout-I 
 nW...hhPPPP..PPP.S....Y      Consensus (where h is hydrophobic) 


Conserved Cytoplasmic Motif #3 
PSPMQPPPPVPVPEGW.Y 1273 Drosophila roundabout-I 
YTDDLPPPPVPPPAIKSP 1493 Human roundabout-I 
YADDLPPPPVPPPAIKSP   90 Mouse roundabout-I 
RAPAMPTNPVPPEPPARY 1077 C. elegans roundabout 
     PPPPVPPP....       Consensus 

The consensus for the first motif is PtPYATTxhh, where x is any amino acid and h is I, L, or 
V. The presence of a tyrosine in the center of the motif indicates a site for phosphorylation. 
The other two motifs consist of runs of prolines separated by one or two amino acids and are 
reminiscent of binding sites for SH3 domains. In particular, the LPPP sequence in motif #2 
provides a good binding site for the Drosophila Enabled protein or its mammalian homologue 
Mena (Niebuhr et al., 1997). All three of these conserved sites can function as binding sites 
for domains (e.g. SH3 domains) of linker/adapter proteins functioning in Robo-mediated 
signal transduction. 
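
The first consensus can be written as a regular expression and applied to the motif #1 sequences of Table 18. In the Python sketch below, the strict pattern treats the lower-case t as threonine, while the relaxed pattern treats that position as variable; this reading of the lower-case letter is an assumption, as are the variable names.

```python
# Illustrative sketch; the interpretation of the lower-case "t" and all
# names are assumptions.
import re

# PtPYATTxhh with t fixed as threonine, x any residue, h one of I, L, V.
STRICT_MOTIF1 = re.compile(r"PTPYATT.[ILV]{2}")
# Same consensus with the lower-case position left variable.
RELAXED_MOTIF1 = re.compile(r"P.PYATT.[ILV]{2}")

# Motif #1 sequences from Table 18, with spacing removed.
MOTIF1_SEQUENCES = {
    "Drosophila roundabout-I": "PDNPTPYATTMIIGTSS",
    "Human roundabout-I": "SGQPTPYATTQLIQSNL",
    "Drosophila roundabout-II": "NASPAPYATSSILSPHQ",
    "C. elegans roundabout": "HDDPSPYATTTLVLSNQ",
}


def matching_proteins(pattern):
    """Names of the Table 18 sequences that contain the pattern."""
    return [name for name, seq in MOTIF1_SEQUENCES.items()
            if pattern.search(seq)]


if __name__ == "__main__":
    print("strict: ", matching_proteins(STRICT_MOTIF1))
    print("relaxed:", matching_proteins(RELAXED_MOTIF1))
```

Neither pattern matches the Drosophila roundabout-II sequence as listed, which reads ATS rather than ATT, so the consensus is best read as a summary of the alignment rather than a strict rule.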

Robo is Regionally Expressed on Longitudinal Axons in the Drosophila Embryo. In 
order to determine the role that robo might play in regulating axon crossing behavior, we 
examined the robo expression pattern in the embryonic CNS. The in situ hybridization 
pattern of robo mRNA in Drosophila shows it to have elevated and widespread expression in 
the CNS. We raised a monoclonal antibody (MAb 13C9) against part of the extracellular 
portion (amino acids 404-725) of the protein to visualize Robo expression. Robo is first seen 
in the embryo weakly expressed in lateral stripes during germband extension. At the onset of 
germband retraction, Robo expression is observed in the neuroectoderm. By the end of stage 
12, as the growth cones first extend, Robo is seen on growth cones which project ipsilaterally, 
including pCC, aCC, MP1, dMP2, and vMP2. Strikingly, little or no Robo expression is 
observed on commissural growth cones as they extend towards and across the midline. 
However, as these growth cones turn to project longitudinally, their level of Robo expression 
dramatically increases. Robo is expressed at high levels on all longitudinally-projecting 
growth cones and axons. In contrast, Robo is expressed at nearly undetectable levels on 
commissural axons. This is striking since ~90% of axons in the longitudinal tracts also have 
axon segments crossing in one of the commissures. Thus, Robo expression is regionally 
restricted. Robo expression is also seen at a low level throughout the epidermis and at a 
higher level at muscle attachment sites. In stage 16-17 embryos, faint Robo staining can be 
seen in the commissures but at levels much lower than observed in the longitudinal tracts. 

Immunoelectron Microscopy of Robo. We used immunoelectron microscopy to 
examine Robo localization at higher resolution. In stage 13 embryos, Robo is expressed at 
higher levels on growth cones and filopodia in the longitudinal tracts than on the longitudinal 
axons themselves. This localization is consistent with the model that Robo functions as a 
guidance receptor. The increased sensitivity of immunoelectron microscopy reveals the 
presence of very low levels of Robo protein on the surface of commissural axons. In addition, 
Robo-positive vesicles can be seen inside the commissural axons, possibly representing 
transport of Robo to the growth cone. Finally, by reconstructing the path of single axons by 
use of serial sections, we confirm that Robo expression is greatly up-regulated after individual 
axons turn from the commissure into a longitudinal tract. The expression of Robo on 
non-crossing and post-crossing axons, and its higher level of expression on growth cones and 
their filopodia, support a model in which Robo functions as an axon guidance receptor for a 
repulsive midline cue. 

Transgenic Expression of Robo. We hypothesized that if Robo is indeed a growth 
cone receptor for a midline repellent, then pan-neural expression of Robo protein during the 
early stages of axon outgrowth might lead to a robo gain-of-function phenotype similar to the 
comm loss-of-function and opposite of the robo loss-of-function. To test this hypothesis, we 
cloned a robo cDNA containing the complete ORF but lacking most of its untranslated 
regions (UTRs) downstream of the UAS promoter in the pUAST vector and generated 
transgenic flies for use in the GAL4 system (Brand and Perrimon, 1993). Expression of robo 
in all neurons was achieved by crossing the UAS-robo flies to either the elav-GAL4 or 
scabrous-GAL4 lines. 

Surprisingly, pan-neural expression of robo mRNA did not produce a strong axon 
scaffold phenotype as assayed with MAb BP102. Staining with anti-Fas II (MAb 1D4) 
revealed subtle fasciculation defects, but overall the axon scaffold looked quite normal. An 
insight into why we failed to observe a stronger robo ectopic expression phenotype was 
provided by staining these embryos with the anti-Robo MAb. Interestingly, the Robo protein, 
although expressed at higher levels than in wild type, remains restricted as in wild type, i.e., 
high levels of expression on the longitudinal portions of axons and very low levels on the 
commissures. This result indicates that there must be strong regulation of Robo expression, 
probably post-translational, that assures its localization to longitudinal axon segments. Such a 
mechanism could operate by the regulation of protein translation, transport, insertion, 
internalization and/or stability. 

We used these transgenic flies to rescue robo mutants. Expression of robo by the 
elav-GAL4 line in both robo 3 and robo 5 homozygotes rescued the midline crossing of Fas II 
positive axons including pCC and other identified neurons. 

Robo Appears to Function in a Cell Autonomous Fashion. To test whether Robo can 
function in a cell autonomous fashion, we used the UAS-robo transgene with the ftzng-GAL4 
line (Lin et al., 1994). The ftzng-GAL4 line expresses in a subset of CNS neurons, including 
many of the earliest neurons to be affected by the robo mutation such as pCC, vMP2, dMP2, 
and MP1. Expression of robo by the ftzng-GAL4 line is sufficient to rescue these identified 
neurons in the robo mutant: pCC, which in robo mutants heads towards and crosses the 
midline, in these rescued embryos now projects ipsilaterally and does not cross the midline. 
When the same embryos were stained with the anti-Robo MAb 13C9, we observed that none 
of the Robo-positive axons crossed the midline. The ftzng-GAL4 line drives expression in many 
of the axons in the pCC pathway (Lin et al., 1994), a medial longitudinal fascicle. In robo 
mutants, this axon fascicle freely crosses and circles the midline, joining with its contralateral 
pathway. When rescued by the ftzng-GAL4 line driving UAS-robo, this pathway now largely 
remains on its own side of the midline, even though occasionally a few axons cross the 
midline. These experiments support the notion that Robo can function in a cell autonomous 
fashion. 

Expression of Mammalian robo1 in the Rat Spinal Cord. The isolation of several 
vertebrate Robo homologues suggests that Robo may play a role in orchestrating 
midline crossing in the vertebrate nervous system similar to its role in Drosophila. In the vertebrate 
spinal cord, the ventral midline is composed of a unique group of cells called the floor plate 
(for review, Colamarino and Tessier-Lavigne, 1995). As in the Drosophila nervous system, 
the vertebrate spinal cord contains both crossing and non-crossing axons. Spinal commissural 
neurons are born in the dorsal half of the spinal cord; commissural axons project to and cross 
the floor plate before turning longitudinally in a rostral direction. In contrast, the axons of two 
other classes of neurons, dorsal association neurons and ventral motor neurons, do not cross 
the floor plate (Altman and Bayer, 1984). 

To address the possibility that Robo may play a role in organizing the projections of 
these spinal neurons, we examined the expression of rat robo1 by RNA in situ hybridization. 
A rat robo1 riboprobe spanning the first three Ig domains was hybridized to transverse 
sections of E13 rat spinal cord. At E13, when many commissural axons will have already 
extended across the floor plate (Altman and Bayer, 1984), rat robo1 is expressed at high levels 
in the dorsal spinal cord, in a pattern corresponding to the cell bodies of commissural neurons. 
Rat robo1 is also expressed at lower levels in a subpopulation of ventral cells in the region of 
the developing motor column. Interestingly, this expression pattern is similar to and overlaps 
partly with that of the mRNA encoding DCC, another Ig superfamily member, which is also 
expressed on commissural and motor neurons and encodes a receptor for Netrin-1 
(Keino-Masu et al., 1996). Rat robo1 is not, however, expressed in either the floor plate or the 
roof plate of the spinal cord or in the dorsal root ganglia. This is in contrast to rat cdo, which 
is strongly expressed in the roof plate (KB, MT-L, and R. Krauss). In the periphery, rat robo1 
is also expressed in the myotome and developing limb, in a pattern 
reminiscent of c-met (Ebens et al., 1996), indicating that rat robo1 may also be expressed by 
migrating muscle precursor cells. Therefore, like its Drosophila homologue, rat robo1 RNA 
is expressed by both crossing and non-crossing populations of axons, indicating that it 
encodes the functional equivalent of D-Robo1. 

Genetic Stocks. All eight independent robo alleles were isolated on chromosomes 
deficient for Fasciclin III as described in Seeger et al., 1993. Subsequent use of a duplication 
that includes FasIII, and recombination of the robo chromosomes, indicates that the robo 
phenotype is independent of the absence of FasIII. Deficiencies were obtained from the 
Drosophila stock center at Bloomington, Indiana. 

Cloning and Molecular Analysis of the robo Genes. Start points for a molecular walk 
to robo were obtained from the Berkeley and Crete Drosophila Genome Projects. 
Chromosomal walking was performed using standard techniques to isolate cosmids from the 
Tamkun library (Tamkun et al., 1992). cDNAs were isolated from the Zinn 9-12 hour 
Drosophila embryo λgt11 library (Zinn et al., 1988), and from a human fetal brain library 
(Stratagene). Northern blots of poly-A+ RNA and reverse Northern blots were hybridized using 
sensitive Church conditions. 

Sequencing of the cDNAs and genomic subclones was performed by the 
dideoxynucleotide chain termination method using Sequenase (USB) following the 
manufacturer's protocol and with the AutoRead kit or AutoCycle kit (Pharmacia) or by 33P 
cycle sequencing. Reactions were analyzed on Pharmacia LKB or ABI automated laser 
fluorescent DNA sequencers, respectively. The cDNAs were sequenced completely on both 
strands. Sequence contigs were compiled using Lasergene, Intelligenetics, and 
AssemblyLIGN software (Kodak Eastman). Database searches were performed using BLAST 
(Altschul et al., 1990). 

A full length D-robo1 cDNA was generated by ligating two partial cDNAs at an 
internal HpaI site and subcloning into the EcoRI site of pBluescript SK+. A full length 
H-robo1 cDNA was synthesized by ligating an XbaI-SalI fragment from a cDNA and a PCR 
product coding for the carboxy-terminal 222 amino acids at a SalI site. The PCR product has 
an EcoRI site introduced at the stop codon. The ligation product was cloned into 
pBluescript SK+ digested with XbaI and EcoRI. 

To clone the rat robo1 cDNA, degenerate oligonucleotide primers designed against 
sequences conserved between the 5' ends of D-Robo1 and H-Robo1 were used to amplify a 
500 bp fragment from an E13 rat brain cDNA by PCR. This fragment was used to screen an 
E13 spinal cord library at high stringency, resulting in the isolation of a 4.2 kb cDNA clone 
comprising all but the last 700 nucleotides. Subsequent screenings of the library with 
non-overlapping probes from this cDNA led to the isolation of 4 partial and 7 full length clones. 
To clone the rat robo2 cDNA, we screened the same library with a fragment of the H-robo2 
cDNA. 

Expressed Sequence Tag and Genomic Sequences. The ESTs yu23d11 (#H77734), 
zr54g12 (#AA236414) and yq76e12 (#H52936, #H52937) code for portions of H-Robo1. The 
EST yq7e12 is aberrantly spliced to part of the human glycophorin B gene. Five ESTs, 
yn50a07, yg02b06, yg17b06, yn13a04 and ym17g11, code for part of H-robo2. The 
Drosophila P1 clone DS00329 encodes the genomic sequence of D-robo2. Sequences 
1825710 and 1825711 (both: #U88183; locus ZK377) code for the predicted sequence of C. 
elegans robo. The EST vi62e02 (#AA499193) codes for mouse robo1. 

Identification of Molecular Defects in robo Alleles. Southern blots of robo alleles and 
their parental chromosomes were hybridized with fragments from the genomic cosmid clone 
106-1435 or partial cDNA clones to identify restriction fragment length polymorphisms 
affecting the robo transcription unit. DNA was obtained from homozygous mutant embryos. 
Thirty-five cycles of PCR were subsequently performed on the DNA obtained from half an embryo. 
The primers specific for the region flanking the CfoI polymorphism were ROBO6 
(5'-GCATTGGGTCATCTGTAGAG-3') and ROBO23 (5'-AGCTATCTGGAGGGAGGCAT-3'). 
The PCR products were purified on a Pharmacia H300 spin column and sequenced 
directly. 
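
As a quick sanity check on the two primers listed above, the following Python sketch computes their length, GC content, and a rough melting temperature. The Wallace rule used here (2 °C per A/T, 4 °C per G/C) is an assumption chosen for illustration; it is not stated in the text and is only a coarse estimate for short oligonucleotides. The dictionary keys are illustrative labels, not names taken from the source.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "primer_1": "GCATTGGGTCATCTGTAGAG",
    "primer_2": "AGCTATCTGGAGGGAGGCAT",
}

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule (an assumption)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

for name, seq in PRIMERS.items():
    print(name, len(seq), f"GC={gc_fraction(seq):.2f}", f"Tm~{wallace_tm(seq)}C")
```

Both primers are 20-mers with similar GC content, which is consistent with their use as a matched pair in a single PCR.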

Transformation of Drosophila, robo Rescue, and Overexpression. The 16 kb XbaI 
fragment from cosmid 106-1435 was cloned into the Drosophila transformation vector 
pCaSpeR3. Transformant lines were generated and mapped by standard procedures. Four 
independent lines were shown to rescue the robo1, robo3 and robo5 alleles as judged by MAb 1D4 staining. 

PCR amplification of the D-robo ORF using the primers 
(5'-GAGTGGTGAATTCAACAGCACCAAAACCACAAAATGCATCCC-3') and 
(5'-CGGGGAGTCTAGAACACTTCATCCTTAGGTG-3') produced a PCR product with an 
altered ribosome binding site that more closely matches the Drosophila consensus (Cavener, 
1987), and has only 21 bp of 5' UTR and no 3' UTR sequences. The PCR product was 
digested with EcoRI and XbaI and cloned into pBluescript (Stratagene) and subsequently into 
pUAST (Brand and Perrimon, 1993). Transformant lines were crossed to elav-GAL4 and 
sca-GAL4 lines, which express GAL4 in all neurons, or ftzng-GAL4, which expresses in a subset of 
CNS neurons (Lin et al., 1994). Embryos were assayed by staining with MAbs BP102, 1D4 
and 13C9. For ectopic expression in the robo mutant background, the stocks robo3 and robo5 
(both protein nulls) were used. Crosses utilized the stocks w; robo/CyO; UAS-robo and w; 
robo/CyO; elav-GAL4. Due to the difficulty of maintaining a balanced stock, 
robo/+; ftzng-GAL4/+ males were generated as required. 
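
The cloning strategy above relies on restriction sites carried in the two ORF primers. A minimal Python sketch, assuming the standard recognition sequences GAATTC (EcoRI) and TCTAGA (XbaI), locates those sites in the primers as printed. The labels FORWARD and REVERSE are assumptions made for illustration (the first primer ends in the ATG start codon region, so it is treated as the forward primer).

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "XbaI": "TCTAGA"}

# Primers as printed in the text (5' to 3'); the labels are assumptions.
FORWARD = "GAGTGGTGAATTCAACAGCACCAAAACCACAAAATGCATCCC"
REVERSE = "CGGGGAGTCTAGAACACTTCATCCTTAGGTG"

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(seq: str) -> dict:
    """Map enzyme name to the 0-based index of its site, if present."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}

print("forward primer sites:", find_sites(FORWARD))
print("reverse primer sites:", find_sites(REVERSE))
# The reverse complement shows the strand that matches the coding sequence.
print("reverse primer, coding-strand view:", reverse_complement(REVERSE))
```

Each primer carries exactly one of the two sites, so digestion of the PCR product with EcoRI and XbaI yields a fragment that can be inserted in a single orientation.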

Generation of Fusion Proteins and Antibodies. A six histidine tagged fusion protein 
was constructed by cloning amino acids 404-725 of the D-robo protein into the PstI site of the 
pQE31 vector (Qiagen). Fusion proteins were purified under denaturing conditions and 
subsequently dialyzed against PBS. Immunization of mice and MAb production followed 
standard protocols (Patel, 1994). 

RNA Localization and Protein Immunocytochemistry. Digoxigenin labeled antisense 
robo transcripts were generated from a subclone of a robo cDNA in Bluescript. In situ tissue 
hybridization was performed as described in Tear et al., 1996. Immunocytochemistry was 
performed as described by Patel, 1994. MAb 1D4 was used at a dilution of 1:5 and BP102 at 
1:10. For anti-Robo staining, MAb 13C9 was diluted 1:10 in PBS with 0.1% Tween-20, and 
the embryos were fixed and cracked so as to minimize exposure to methanol. The presence of 
Triton and storage of embryos in methanol were both found to destroy the activity of MAb 
13C9. 

In situ hybridization of rat spinal cords was carried out essentially as described in Fan 
and Tessier-Lavigne, 1994. E13 embryos were fixed in 4% paraformaldehyde, processed, 
embedded in OCT, and sectioned to 10 µm. A 1.0 kb 35S antisense rRobo riboprobe spanning 
the first three immunoglobulin domains was used for hybridization. An additional 
non-overlapping probe was also used with identical results. DCC transcripts were detected as 
described in Keino-Masu et al., 1996. Immunohistochemistry against TAG-1 was carried out 
on 10 µm transverse spinal cord sections using 4D7 monoclonal antibody (Dodd et al., 1988). 

Electron Microscopy. Canton S embryos were hand devitellinized, opened dorsally to 
remove the gut, and prepared for immunoelectron microscopy according to the procedures 
described previously (Lin et al., 1994), with the following modifications. The fixed embryos 
were incubated sequentially with MAb 13C9 (1:1) for 1-2 hours, biotinylated goat anti-mouse 
secondary antibody (1:250) for 1.5 hours, and then streptavidin-conjugated HRP (1:200) for 
1.5 hours. Hydrogen peroxide (0.01%) was used instead of glucose oxidase for the HRP-DAB 
reaction. 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 


ATGCATCCCA 

TGCATCCCGA 

AAACCACGCC 

ATCGCCCGGA 

GCACGAGCAC 

CACTAATAAC 

60 

CCATCTCGCA 

GTCGGAGCAG 

CAGGATGTGG 

CTCCTGCCCG 

CCTGGCTGCT 

CCTCGTCCTG 

120 

GTGGCCAGCA 

ATGGCCTGCC 

AGCAGTCAGA 

GGCCAGTACC 

AATCGCCACG 

TATCATCGAG 

180 

CATCCCACGG 

ATCTGGTCGT 

TAAGAAGAAT 

GAACCCGCCA 

CGCTCAACTG 

CAAAGTGGAG 

240 

GGCAAGCCGG 

AACCCACCAT 

TGAGTGGTTT 

AAGGATGGCG 

AACCCGTCAG 

CACCAACGAA 

300 

AAGAAATCGC 

ACCGCGTCCA 

GTTCAAGGAC 

GGCGCCCTCT 

TCTTTTACAG 

GACAATGCAA 

360 

GGCAAGAAGG 

AGCAGGACGG 

CGGAGAGTAC 

TGGTGCGTGG 

CCAAGAACCG 

AGTGGGCCAG 

420 

GCCGTTAGTC 

GCCATGCCTC 

CCTCCAGATA 

GCTGTTTTGC 

GCGACGATTT 

TCGCGTGGAG 

480 

CCCAAAGACA 

CGCGAGTGGC 

CAAAGGCGAG 

ACGGCTCTGC 

TGGAGTGTGG 

GCCGCCCAAA 

540 

GGCATTCCAG 

AGCCAACGCT 

GATTTGGATA 

AAGGACGGCG 

TTCCCTTGGA 

CGACCTGAAA 

600 

GCCATGTCGT 

TTGGCGCCAG 

CTCCCGCGTT 

CGAATTGTGG 

ACGGTGGCAA 

CCTGCTGATC 

660 

AGCAATGTGG 

AGCCCATTGA 

TGAGGGCAAC 

TACAAGTGCA 

TTGCCCAGAA 

TCTGGTAGGC 

720 

ACCCGCGAGA 

GCAGCTATGC 

CAAGCTGATT 

GTCCAGGTCA 

AACCATACTT 

TATGAAGGAG 

780 

CCCAAGGATC 

AGGTGATGCT 

CTACGGCCAG 

ACAGCCACTT 

TCCACTGCTC 

AGTGGGCGGT 

840 

GATCCGCCGC 

CGAAAGTGTT 

GTGGAAAAAG 

GAGGAGGGCA 

ATATTCCGGT 

GTCCAGAGCG 

900 

CGAATCCTTC 

ACGACGAGAA 

AAGTTTAGAG 

ATATCCAACA 

TAACGCCCAC 

CGATGAGGGC 

960 

ACCTATGTCT 

GCGAGGCACA 

CAACAATGTC 

GGTCAGATCA 

GCGCTAGGGC 

TTCTCTTATA 

1020 

GTCCACGCTC 

CGCCGAACTT 

TACGAAAAGA 

CCCAGTAACA 

AGAAAGTGGG 

ACTAAATGGG 

1080 

GTTGTCCAAC 

TACCTTGCAT 

GGCCTCCGGA 

AACCCTCCGC 

CGTCTGTATT 

CTGGACCAAG 

1140 

GAAGGAGTAT 

CCACTCTTAT 

GTTCCCAAAT 

AGTTCGCACG 

GAAGGCAGTA 

TGTGGCTGCC 

1200 

GATGGAACTC 

TGCAGATTAC 

GGATGTGCGG 

CAGGAAGACG 

AAGGCTACTA 

TGTGTGTTCC 

1260 

GCTTTCAGTG 

TAGTCGATTC 

CTCTACAGTA 

CGGGTTTTCC 

TGCAAGTCAG 

CTCGGTAGAC 

1320 

GAGCGTCCAC 

CTCCGATTAT 

TCAAATCGGA 

CCTGCCAATC 

AAACACTGCC 

CAAGGGATCA 

1380 

GTTGCTACTT 

TACCCTGTCG 

GGCCACTGGA 

AATCCCAGTC 

CCCGTATCAA 

GTGGTTCCAC 

1440 

GATGGACATG 

CCGTACAAGC 

GGGCAATCGA 

TACAGCATCA 

TCCAAGGAAG 

CTCACTGAGA 

1500 

GTCGATGACC 

TTCAACTAAG 

TGACTCTGGT 

ACCTACACCT 

GCACTGCATC 

TGGCGAACGA 

1560 

GGAGAAACTT 

CCTGGGCTGC 

CACACTAACG 

GTGGAAAAAC 

CCGGTTCTAC 

ATCTCTTCAC 

1620 

CGGGCAGCTG 

ATCCTAGCAC 

TTATCCTGCT 

CCTCCAGGAA 

CACCTAAAGT 

CCTGAATGTC 

1680 

AGTCGCACCA 

GCATTAGTCT 

TCGTTGGGCT 

AAAAGCCAAG 

AGAAACCCGG 

AGCTGTGGGC 

1740 

CCAATCATTG 

GATACACTGT 

AGAGTACTTC 

AGTCCGGATC 

TGCAAACTGG 

TTGGATTGTG 

1800 

GCTGCCCATC 

GAGTCGGCGA 

CACTCAAGTC 

ACTATCTCGG 

GTCTCACTCC 

TGGCACTTCG 

1860 

TATGTGTTCC 

TAGTTAGAGC 

TGAGAATACT 

CAGGGTATTT 

CTGTGCCTTC 

CGGCTTATCA 

1920 

AATGTTATTA 

AAACCATTGA 

GGCAGATTTC 

GATGCAGCTT 

CTGCCAATGA 

TTTGTCAGCA 

1980 

GCTCGAACTT 

TGCTGACAGG 

AAAGTCGGTG 

GAGCTAATAG 

ATGCCTCGGC 

TATCAATGCT 

2040 

AGTGCCGTTA 

GACTTGAGTG 

GATGCTCCAC 

GTGAGCGCTG 

ATGAGAAATA 

CGTAGAGGGC 

2100 

CTGCGCATAC 

ACTATAAGGA 

TGCCAGTGTA 

CCATCCGCAC 

AGTATCACTC 

GATCACTGTT 

2160 

ATGGATGCCT 

CTGCAGAATC 

GTTTGTGGTG 

GGAAACCTTA 

AGAAGTACAC 

CAAGTATGAG 

2220 

TTCTTCCTAA 

CACCCTTTTT 

TGAGACAATT 

GAAGGACAGC 

CCAGTAACTC 

CAAGACAGCC 

2280 

CTCACCTATG 

AAGATGTTCC 

CTCCGCACCA 

CCGGATAACA 

TTCAGATTGG 

CATGTACAAC 

2340 

CAAACAGCCG 

GTTGGGTGCG 

TTGGACTCCG 

CCACCCTCCC 

AGCACCACAA 

TGGCAATTTG 

2400 

TATGGCTACA 

AGATTGAGGT 

CAGCGCCGGT 

AACACCATGA 

AGGTGCTGGC 

CAATATGACT 

2460 

CTTAATGCTA 

CCACCACATC 

TGTGCTCCTA 

AATAACCTAA 

CCACCGGAGC 

TGTGTACAGC 

2520 

GTGAGGTTGA 

ACTCCTTTAC 

CAAGGCAGGA 

GATGGACCTT 

ACTCCAAACC 

GATATCACTA 

2580 

TTCATGGACC 

CCACCCATCA 

TGTGCATCCG 

CCACGGGCAC 

ATCCAAGCGG 

CACCCATGAT 

2640 

GGGCGACATG 

AGGGACAGGA 

TCTCACGTAT 

CATAACAATG 

GCAACATACC 

ACCTGGCGAC 

2700 

ATTAATCCCA 

CCACTCATAA 

AAAGACCACT 

GACTACCTAT 

CTGGACCGTG 

GCTAATGGTG 

2760 

CTGGTCTGCA 

TCGTTCTTCT 

AGTCCTGGTT 

ATTTCGGCGG 

CTATTTCGAT 

GGTCTACTTC 

2820 

AAGCGCAAGC 

ATCAAATGAC 

CAAGGAATTG 

GGTCACTTAA 

GTGTGGTCAG 

TGACAACGAA 

2880 

ATAACCGCAT 

TAAATATCAA 

TAGCAAAGAG 

AGCCTTTGGA 

TAGACCATCA 

TCGTGGATGG 

2940 

CGAACTGCCG 

ATACTGACAA 

AGACTCAGGA 

TTAAGCGAAT 

CGAAGCTACT 

ATCCCACGTT 

3000 

AACAGCAGTC 

AATCCAACTA 

CAATAACTCC 

GATGGAGGAA 

CCGATTATGC 

AGAAGTTGAC 

3060 

ACCCGTAACC 

TTACCACCTT 

CTACAATTGT 

CGCAAGAGCC 

CCGATAATCC 

CACGCCGTAC 

3120 

GCCACCACTA 

TGATCATTGG 

TACCTCTTCC 

AGTGAGACCT 

GCACCAAGAC 

AACATCTATA 

3180 

AGTGCCGATA 

AGGACTCGGG 

AACTCATTCG 

CCCTATTCTG 

ACGCATTTGC 

CGGTCAGGTG 

3240 

CCAGCGGTTC 

CTGTTGTCAA 

ATCCAACTAT 

CTTCAGTATC 

CGGTTGAACC 

GATCAACTGG 

3300 

TCAGAGTTTC 

TACCCCCGCC 

GCCAGAACAC 

CCACCTCCGT 

CTTCTACCTA 

TGGATACGCA 

3360 

CAAGGATCTC 

CTGAATCTTC 

GCGGAAGAGC 

TCCAAAAGCG 

CAGGTTCCGG 

CATTTCTACA 

3420 

AATCAAAGCA 

TTCTGAACGC 

ATCCATACAC 

AGCAGCTCCT 

CGGGCGGCTT 

TTCAGCTTGG 

3480 

GGAGTATCGC 

CCCAATATGC 

TGTCGCCTGT 

CCACCGGAAA 

ACGTTTATAG 

CAATCCGCTG 

3540 

TCGGCAGTGG 

CTGGCGGCAC 

CCAGAACCGC 

TATCAGATAA 

CGCCCACAAA 

CCAACATCCG 

3600 

CCACAGTTAC 

CGGCCTACTT 

TGCCACCACG 

GGTCCAGGAG 

GAGCTGTACC 

ACCCAACCAC 

3660 

CTGCCATTTG 

CCACACAGCG 

TCATGCAGCC 

AGCGAGTACC 

AGGCTGGACT 

GAATGCAGCG 

3720 

CGATGTGCCC 

AAAGCCGCGC 

CTGCAACAGC 

TGCGATGCCT 

TGGCCACACC 

CTCGCCCATG 

3780 

CAACCCCCAC 

CGCCAGTTCC 

CGTACCCGAG 

GGCTGGTACC 

AACCGGTGCA 

TCCCAATAGC 

3840 

CACCCGATGC 

ACCCGACCTC 

CTCCAACCAC 

CAGATCTACC 

AGTGCTCCTC 

CGAGTGCTCG 

3900 

GATCACTCGA 

GGAGCTCGCA 

GAGTCACAAG 

CGGCAGCTGC 

AGCTCGAGGA 

GCACGGCAGC 

3960 

AGTGCCAAAC 

AACGCGGAGG 

ACACCACCGT 

CGACGAGCCC 

CGGTGGTGCA 

GCCGTGCATG 

4020 

GAGAGCGAGA 

ACGAGAACAT 

GCTGGCGGAG 

TACGAGCAGC 

GCCAGTACAC 

CAGCGATTGC 

4080 

TGCAATAGCT 

CCCGCGAGGG 

CGACACCTGC 

TCCTGCAGCG 

AGGGATCCTG 

TCTTTACGCC 

4140 

GAGGCGGGCG 

AGCCGGCGCC 

TCGTCAAATG 

ACTGCTAAGA 

ACACCTAA 


4188 


(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1395 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

Met His Pro Met His Pro Glu Asn His Ala Ile Ala Arg Ser Thr Ser 
1 5 10 15 

Thr Thr Asn Asn Pro Ser Arg Ser Arg Ser Ser Arg Met Trp Leu Leu 
20 25 30 

Pro Ala Trp Leu Leu Leu Val Leu Val Ala Ser Asn Gly Leu Pro Ala 
35 40 45 

Val Arg Gly Gln Tyr Gln Ser Pro Arg Ile Ile Glu His Pro Thr Asp 
50 55 60 

Leu Val Val Lys Lys Asn Glu Pro Ala Thr Leu Asn Cys Lys Val Glu 
65 70 75 80 

Gly Lys Pro Glu Pro Thr Ile Glu Trp Phe Lys Asp Gly Glu Pro Val 
85 90 95 

Ser Thr Asn Glu Lys Lys Ser His Arg Val Gln Phe Lys Asp Gly Ala 
100 105 110 

Leu Phe Phe Tyr Arg Thr Met Gln Gly Lys Lys Glu Gln Asp Gly Gly 
115 120 125 

Glu Tyr Trp Cys Val Ala Lys Asn Arg Val Gly Gln Ala Val Ser Arg 
130 135 140 

His Ala Ser Leu Gln Ile Ala Val Leu Arg Asp Asp Phe Arg Val Glu 
145 150 155 160 

Pro Lys Asp Thr Arg Val Ala Lys Gly Glu Thr Ala Leu Leu Glu Cys 
165 170 175 

Gly Pro Pro Lys Gly Ile Pro Glu Pro Thr Leu Ile Trp Ile Lys Asp 
180 185 190 

Gly Val Pro Leu Asp Asp Leu Lys Ala Met Ser Phe Gly Ala Ser Ser 
195 200 205 

Arg Val Arg Ile Val Asp Gly Gly Asn Leu Leu Ile Ser Asn Val Glu 
210 215 220 

Pro Ile Asp Glu Gly Asn Tyr Lys Cys Ile Ala Gln Asn Leu Val Gly 
225 230 235 240 

Thr Arg Glu Ser Ser Tyr Ala Lys Leu Ile Val Gln Val Lys Pro Tyr 
245 250 255 

Phe Met Lys Glu Pro Lys Asp Gln Val Met Leu Tyr Gly Gln Thr Ala 
260 265 270 

Thr Phe His Cys Ser Val Gly Gly Asp Pro Pro Pro Lys Val Leu Trp 
275 280 285 

Lys Lys Glu Glu Gly Asn Ile Pro Val Ser Arg Ala Arg Ile Leu His 
290 295 300 

Asp Glu Lys Ser Leu Glu Ile Ser Asn Ile Thr Pro Thr Asp Glu Gly 
305 310 315 320 

Thr Tyr Val Cys Glu Ala His Asn Asn Val Gly Gln Ile Ser Ala Arg 
325 330 335 

Ala Ser Leu Ile Val His Ala Pro Pro Asn Phe Thr Lys Arg Pro Ser 
340 345 350 

Asn Lys Lys Val Gly Leu Asn Gly Val Val Gln Leu Pro Cys Met Ala 
355 360 365 

Ser Gly Asn Pro Pro Pro Ser Val Phe Trp Thr Lys Glu Gly Val Ser 
370 375 380 

Thr Leu Met Phe Pro Asn Ser Ser His Gly Arg Gln Tyr Val Ala Ala 
385 390 395 400 

Asp Gly Thr Leu Gln Ile Thr Asp Val Arg Gln Glu Asp Glu Gly Tyr 
405 410 415 

Tyr Val Cys Ser Ala Phe Ser Val Val Asp Ser Ser Thr Val Arg Val 
420 425 430 

Phe Leu Gln Val Ser Ser Val Asp Glu Arg Pro Pro Pro Ile Ile Gln 
435 440 445 

Ile Gly Pro Ala Asn Gln Thr Leu Pro Lys Gly Ser Val Ala Thr Leu 
450 455 460 

Pro Cys Arg Ala Thr Gly Asn Pro Ser Pro Arg Ile Lys Trp Phe His 
465 470 475 480 

Asp Gly His Ala Val Gln Ala Gly Asn Arg Tyr Ser Ile Ile Gln Gly 
485 490 495 

Ser Ser Leu Arg Val Asp Asp Leu Gln Leu Ser Asp Ser Gly Thr Tyr 
500 505 510 

Thr Cys Thr Ala Ser Gly Glu Arg Gly Glu Thr Ser Trp Ala Ala Thr 
515 520 525 

Leu Thr Val Glu Lys Pro Gly Ser Thr Ser Leu His Arg Ala Ala Asp 
530 535 540 

Pro Ser Thr Tyr Pro Ala Pro Pro Gly Thr Pro Lys Val Leu Asn Val 
545 550 555 560 

Ser Arg Thr Ser Ile Ser Leu Arg Trp Ala Lys Ser Gln Glu Lys Pro 
565 570 575 

Gly Ala Val Gly Pro Ile Ile Gly Tyr Thr Val Glu Tyr Phe Ser Pro 
580 585 590 

Asp Leu Gln Thr Gly Trp Ile Val Ala Ala His Arg Val Gly Asp Thr 
595 600 605 

Gln Val Thr Ile Ser Gly Leu Thr Pro Gly Thr Ser Tyr Val Phe Leu 
610 615 620 

Val Arg Ala Glu Asn Thr Gln Gly Ile Ser Val Pro Ser Gly Leu Ser 
625 630 635 640 

Asn Val Ile Lys Thr Ile Glu Ala Asp Phe Asp Ala Ala Ser Ala Asn 
645 650 655 

Asp Leu Ser Ala Ala Arg Thr Leu Leu Thr Gly Lys Ser Val Glu Leu 
660 665 670 

Ile Asp Ala Ser Ala Ile Asn Ala Ser Ala Val Arg Leu Glu Trp Met 
675 680 685 

Leu His Val Ser Ala Asp Glu Lys Tyr Val Glu Gly Leu Arg Ile His 
690 695 700 

Tyr Lys Asp Ala Ser Val Pro Ser Ala Gln Tyr His Ser Ile Thr Val 
705 710 715 720 

Met Asp Ala Ser Ala Glu Ser Phe Val Val Gly Asn Leu Lys Lys Tyr 
725 730 735 

Thr Lys Tyr Glu Phe Phe Leu Thr Pro Phe Phe Glu Thr Ile Glu Gly 
740 745 750 

Gln Pro Ser Asn Ser Lys Thr Ala Leu Thr Tyr Glu Asp Val Pro Ser 
755 760 765 

Ala Pro Pro Asp Asn Ile Gln Ile Gly Met Tyr Asn Gln Thr Ala Gly 
770 775 780 

Trp Val Arg Trp Thr Pro Pro Pro Ser Gln His His Asn Gly Asn Leu 
785 790 795 800 

Tyr Gly Tyr Lys Ile Glu Val Ser Ala Gly Asn Thr Met Lys Val Leu 
805 810 815 

Ala Asn Met Thr Leu Asn Ala Thr Thr Thr Ser Val Leu Leu Asn Asn 
820 825 830 

Leu Thr Thr Gly Ala Val Tyr Ser Val Arg Leu Asn Ser Phe Thr Lys 
835 840 845 

Ala Gly Asp Gly Pro Tyr Ser Lys Pro Ile Ser Leu Phe Met Asp Pro 
850 855 860 

Thr His His Val His Pro Pro Arg Ala His Pro Ser Gly Thr His Asp 
865 870 875 880 

Gly Arg His Glu Gly Gln Asp Leu Thr Tyr His Asn Asn Gly Asn Ile 
885 890 895 

Pro Pro Gly Asp Ile Asn Pro Thr Thr His Lys Lys Thr Thr Asp Tyr 
900 905 910 

Leu Ser Gly Pro Trp Leu Met Val Leu Val Cys Ile Val Leu Leu Val 
915 920 925 

Leu Val Ile Ser Ala Ala Ile Ser Met Val Tyr Phe Lys Arg Lys His 
930 935 940 

Gln Met Thr Lys Glu Leu Gly His Leu Ser Val Val Ser Asp Asn Glu 
945 950 955 960 

Ile Thr Ala Leu Asn Ile Asn Ser Lys Glu Ser Leu Trp Ile Asp His 
965 970 975 

His Arg Gly Trp Arg Thr Ala Asp Thr Asp Lys Asp Ser Gly Leu Ser 
980 985 990 

Glu Ser Lys Leu Leu Ser His Val Asn Ser Ser Gln Ser Asn Tyr Asn 
995 1000 1005 

Asn Ser Asp Gly Gly Thr Asp Tyr Ala Glu Val Asp Thr Arg Asn Leu 
1010 1015 1020 

Thr Thr Phe Tyr Asn Cys Arg Lys Ser Pro Asp Asn Pro Thr Pro Tyr 
1025 1030 1035 1040 

Ala Thr Thr Met Ile Ile Gly Thr Ser Ser Ser Glu Thr Cys Thr Lys 
1045 1050 1055 

Thr Thr Ser Ile Ser Ala Asp Lys Asp Ser Gly Thr His Ser Pro Tyr 
1060 1065 1070 

Ser Asp Ala Phe Ala Gly Gln Val Pro Ala Val Pro Val Val Lys Ser 
1075 1080 1085 

Asn Tyr Leu Gln Tyr Pro Val Glu Pro Ile Asn Trp Ser Glu Phe Leu 
1090 1095 1100 

Pro Pro Pro Pro Glu His Pro Pro Pro Ser Ser Thr Tyr Gly Tyr Ala 
1105 1110 1115 1120 

Gln Gly Ser Pro Glu Ser Ser Arg Lys Ser Ser Lys Ser Ala Gly Ser 
1125 1130 1135 

Gly Ile Ser Thr Asn Gln Ser Ile Leu Asn Ala Ser Ile His Ser Ser 
1140 1145 1150 

Ser Ser Gly Gly Phe Ser Ala Trp Gly Val Ser Pro Gln Tyr Ala Val 
1155 1160 1165 

Ala Cys Pro Pro Glu Asn Val Tyr Ser Asn Pro Leu Ser Ala Val Ala 
1170 1175 1180 

Gly Gly Thr Gln Asn Arg Tyr Gln Ile Thr Pro Thr Asn Gln His Pro 
1185 1190 1195 1200 

Pro Gln Leu Pro Ala Tyr Phe Ala Thr Thr Gly Pro Gly Gly Ala Val 
1205 1210 1215 

Pro Pro Asn His Leu Pro Phe Ala Thr Gln Arg His Ala Ala Ser Glu 
1220 1225 1230 

Tyr Gln Ala Gly Leu Asn Ala Ala Arg Cys Ala Gln Ser Arg Ala Cys 
1235 1240 1245 

Asn Ser Cys Asp Ala Leu Ala Thr Pro Ser Pro Met Gln Pro Pro Pro 
1250 1255 1260 

Pro Val Pro Val Pro Glu Gly Trp Tyr Gln Pro Val His Pro Asn Ser 
1265 1270 1275 1280 

His Pro Met His Pro Thr Ser Ser Asn His Gln Ile Tyr Gln Cys Ser 
1285 1290 1295 

Ser Glu Cys Ser Asp His Ser Arg Ser Ser Gln Ser His Lys Arg Gln 
1300 1305 1310 

Leu Gln Leu Glu Glu His Gly Ser Ser Ala Lys Gln Arg Gly Gly His 
1315 1320 1325 

His Arg Arg Arg Ala Pro Val Val Gln Pro Cys Met Glu Ser Glu Asn 
1330 1335 1340 

Glu Asn Met Leu Ala Glu Tyr Glu Gln Arg Gln Tyr Thr Ser Asp Cys 
1345 1350 1355 1360 

Cys Asn Ser Ser Arg Glu Gly Asp Thr Cys Ser Cys Ser Glu Gly Ser 
1365 1370 1375 

Cys Leu Tyr Ala Glu Ala Gly Glu Pro Ala Pro Arg Gln Met Thr Ala 
1380 1385 1390 

Lys Asn Thr 
1395 
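
The nucleotide and amino acid listings can be cross-checked against each other. The Python sketch below, which assumes the standard genetic code, translates the first 60 nucleotides of SEQ ID NO: 1 and SEQ ID NO: 3 as printed in this listing and confirms that the stated lengths agree (coding nucleotides divided by three, minus one stop codon). It is an illustrative check only, not part of the described methods.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}

def translate(dna: str) -> list:
    """Translate DNA (spaces ignored) into three-letter residue names."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)]

# First 60 nucleotides of each cDNA, copied from the listing.
SEQ1_START = "ATGCATCCCA TGCATCCCGA AAACCACGCC ATCGCCCGGA GCACGAGCAC CACTAATAAC"
SEQ3_START = "GGTGAAAATC CACGCATCAT CGAGCATCCC ATGGACACGA CGGTGCCAAA AAATGATCCA"

print(" ".join(translate(SEQ1_START)))
print(" ".join(translate(SEQ3_START)))

# Length consistency: codons minus one stop codon equals residues.
print(4188 // 3 - 1, 4146 // 3 - 1)
```

The translated prefixes can be compared by eye with the opening residues of SEQ ID NO: 2 and SEQ ID NO: 4.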


(2) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4146 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 


GGTGAAAATC 

CACGCATCAT 

CGAGCATCCC 

ATGGACACGA 

CGGTGCCAAA 

AAATGATCCA 

60 

TTTACGTTTA 

ATTGCCAGGC 

CGAGGGCAAT 

CCAACACCAA 

CCATTCAATG 

GTTTAAGGAC 

120 

GGTCGCGAAC 

TGAAGACGGA 

TACGGGTTCG 

CATCGCATAA 

TGCTGCCCGC 

CGGGGGTCTA 

180 

TTCTTTCTCA 

AGGTTATCCA 

CTCACGTAGA 

GAGAGCGATG 

CGGGCACTTA 

CTGGTGCGAG 

240 

GCCAAAAACG 

AGTTTGGAGT 

GGCACGGTCC 

AGGAATGCAA 

CGTTGCAAGT 

GGCAGTTCTC 

300 

CGCGACGAAT 

TCCGTTTGGA 

GCCGGCAAAT 

ACCCGCGTGG 

CCCAAGGCGA 

GGTGGCCCTG 

360 

ATGGAATGCG 

GTGCCCCCCG 

AGGATCTCCG 

GAGCCGCAAA 

TCTCGTGGCG 

CAAGAACGGC 

420 

CAGACCCTGA 

ATCTTGTCGG 

GAACAAGCGG 

ATTCGCATTG 

TCGACGGTGG 

CAATCTGGCC 

480 

ATCCAGGAAG 

CCCGCCAATC 

GGACGACGGA 

CGCTACCAGT 

GTGTGGTCAA 

GAATGTGGTT 

540 

GGCACCCGGG 

AGTCGGCCAC 

CGCTTTTCTT 

AAAGTGCATG 

TACGTCCATT 

CCTCATCCGA 

600 

GGACCCCAGA 

ATCAGACGGC 

GGTGGTGGGC 

AGCTCGGTGG 

TCTTCCAGTG 

CCGCATCGGA 

660 

GGCGATCCCC 

TGCCTGATGT 

CCTGTGGCGA 

CGCACTGCCT 

CCGGCGGCAA 

TATGCCACTG 

720 

CGTAAGTTTT 

CTTGGCTTCA 

TTCAGCTTCA 

GGTCGTGTGC 

ACGTACTTGA 

GGACCGCAGT 

780 

CTGAAGCTGG 

ACGACGTTAC 

TCTGGAGGAC 

ATGGGCGAGT 

ACACTTGCGA 

GGCGGACAAT 

840 

GCGGTGGGCG 

GCATCACGGC 

CACTGGCATC 

CTCACCGTTC 

ACGCTCCCCC 

CAAATTTGTG 

900 

ATACGCCCCA 

AGAATCAGCT 

GGTGGAGATC 

GGTGATGAAG 

TGCTGTTCGA 

GTGCCAAGCG 

960 

AATGGACATC 

CCCGACCAAC 

GCTCTACTGG 

TCGGTGGAGG 

GCAACAGCTC 

CCTGCTGCTC 

1020 

CCCGGCTATC 

GGGATGGCCG 

CATGGAAGTG 

ACCCTGACGC 

CCGAGGGGCG 

CTCGGTGCTC 

1080 

TCGATAGCTC 

GATTTGCCCG 

TGAGGATTCC 

GGAAAGGTGG 

TCACTTGCAA 

CGCCCTGAAC 

1140 

GCCGTGGGCA 

GCGTCAGCAG 

TCGGACTGTG 

GTCAGTGTGG 

ATACGCAATT 

CGAGCTGCCA 

1200 

CCGCCGATTA 

TCGAACAGGG 

GCCCGTGAAT 

CAAACGTTGC 

CCGTTAAATC 

AATTGTGGTT 

1260 

CTGCCATGCC 

GAACTCTGGG 

CACTCCAGTG 

CCACAGGTCT 

CTTGGTACCT 

GGATGGCATA 

1320 

CCCATCGATG 

TGCAGGAGCA 

CGAGCGGCGG 

AATCTTTCGG 

ACGCTGGAGC 

CTTAACCATT 

1380 

TCGGATCTTC 

AGCGCCACGA 

GGATGAAGGC 

TTGTACACCT 

GCGTGGCCAG 

CAATCGCAAC 

1440 

GGAAAATCCT 

CTTGGAGTGG 

TTACCTTCGT 

CTGGACACCC 

CGACAAATCC 

GAATATCAAG 

1500 

TTCTTCAGAG 

CCCCAGAACT 

TTCCACCTAC 

CCAGGGCCGC 

CAGGAAAACC 

GCAAATGGTG 

1560 

GAGAAGGGCG 

AAAATTCGGT 

GACTCTCAGC 

TGGACGAGGA 

GCAACAAGGT 

GGGCGGCTCC 

1620 

AGTCTGGTGG 

GCTATGTAAT 

CGAGATGTTT 

GGCAAAAACG 

AAACGGATGG 

CTGGGTGGCT 

1680 

GTGGGCACTA 

GGGTGCAAAA 

TACCACGTTT 

ACCCAAACGG 

GTCTGCTGCC 

GGGTGTGAAT 

1740 

TACTTCTTTC 

TAATTCGAGC 

CGAGAACTCC 

CATGGCTTAT 

CACTGCCCAG 

TCCGATGTCG 

1800 

GAACCCATTA 

CGGTGGGAAC 

GCGCTACTTC 

AATAGTGGTC 

TGGATCTGAG 

CGAGGCTCGT 

1860 

GCCAGTCTGC 

TGTCCGGAGA 

TGTTGTGGAG 

CTGAGCAACG 

CCAGTGTGGT 

GGACTCCACT 

1920 

AGCATGAAAC 

TCACCTGGCA 

GATCATCAAT 

GGCAAATACG 

TCGAGGGCTT 

CTATGTCTAT 

1980 

GCGAGACAGT 

TGCCAAATCC 

AATAGTCAAC 

AATCCGGCGC 

CCGTTACTAG 

CAATACCAAT 

2040 

CCGCTGCTGG 

GCTCTACATC 

CACATCCGCA 

TCCGCATCCG 

CCTCGGCATC 

GGCATTGATT 

2100 

TCGACAAAGC 

CAAATATTGC 

AGCTGCCGGC 

AAACGTGATG 

GGGAGACAAA 

CCAGAGTGGA 

2160 

GGAGGAGCTC 

CGACCCCACT 

GAACACCAAG 

TATCGCATGC 

TAACGATTCT 

CAATGGCGGT 

2220 

GGCGCCTCAT 

CCTGCACCAT 

CACCGGGCTC 

GTCCAGTACA 

CGCTGTATGA 

ATTTTTCATC 

2280 

GTGCCATTTT 

ACAAATCCGT 

CGAGGGCAAG 

CCGTCGAATT 

CGCGCATCGC 

TCGCACCCTT 

2340 

GAAGATGTTC 

CCTCTGAGGC 

ACCATATGGA 

ATGGAGGCTC 

TGCTGTTGAA 

CTCCTCCGCG 

2400 

GTCTTCCTCA 

AATGGAAGGC 

ACCAGAACTC 

AAGGATCGGC 

ATGGTGTTCT 

CTTGAACTAT 

2460 

CATGTTATAG 

TCCGAGGTAT 

TGACACTGCC 

CACAATTTCT 

CACGCATTTT 

GACAAATGTC 

2520 

ACCATCGATG 

CCGCTTCGCC 

TACTCTGGTT 

TTGGCCAATC 

TCACCGAAGG 

CGTCATGTAC 

2580 

ACCGTGGGCG 

TGGCGGCCGG 

AAATAACGCT 

GGAGTTGGTC 

CTTATTGTGT 

CCCAGCTACT 

2640 

TTGCGTTTGG 

ATCCCATCAC 

AAAGCGACTC 

GATCCGTTCA 

TCAATCAGCG 

GGACCATGTT 

2700 

AACGATGTGC 

TGACGCAGCC 

CTGGTTCATA 

ATACTCCTGG 

GCGCCATCCT 

GGCCGTTCTT 

2760 

ATGCTGTCCT 

TTGGCGCAAT 

GGTCTTTGTG 

AAGCGCAAGC 

ACATGATGAT 

GAAGCAGTCG 

2820 

GCCCTAAATA 

CAATGCGTGG 

CAATCACACG 

AGCGACGTGC 

TCAAAATGCC 

GAGTCTATCG 

2880 

GCGCGCAATG 

GAAACGGCTA 

CTGGCTGGAC 

TCCTCCACCG 

GCGGAATGGT 

GTGGCGTCCC 

2940 

TCGCCCGGCG 

GCGACTCGCT 

GGAGATGCAA 

AAGGATCACA 

TCGCCGACTA 

TGCGCCGGTC 

3000 

TGCGGTGCCC 

CCGGTTCTCC 

GGCCGGCGGT 

GGCACCTCTT 

CCGGTGGATC 

CGGTGGCGCG 

3060 

GGCAGCGGTG 

CCAGCGGCGG 

CGATGACATT 

CATGGAGGAC 

ACGGCAGCGA 

ACGCAATCAG 

3120 

CAGCGGTACG 

TGGGCGAGTA 

CTCCAACATA 

CCGACCGACT 

ATGCAGAGGT 

GTCCAGTTTT 

3180 

GGCAAGGCAC 

CCAGCGAGTA 

TGGTCGGCAT 

GGCAACGCCT 

CCCCGGCCCC 

TTATGCCACC 

3240 

TCTTCGATCC 

TGAGTCCCCA 

CCAGCAGCAA 

CAGCAGCAGC 

AGCCGCGTTA 

TCAACAGCGA 

3300 

CCAGTGCCCG 

GCTATGGGCT 

CCAGCGCCCA 

ATGCACCCAC 

ACTACCAGCA 

GCAGCAGCAT 

3360 

CAGCAGCAAC 

AGGCGCAGCA 

GACGCACCAG 

CAACACCAGG 

CTCTCCAGCA 

GCACCAGCAA 

3420 

CTGCCACCCA 

GCAACATCTA 

CCAGCAGATG 

TCCACCACCA 

GCGAGATATA 

CCCCACGAAC 

3480 

ACGGGTCCTT 

CGCGCTCTGT 

CTACTCTGAG 

CAGTATTACT 

ACCCCAAGGA 

CAAGCAGAGA 

3540 

CACATCCACA 

TCACCGAGAA 

CAAGCTGAGC 

AACTGCCACA 

CCTATGAGGC 

GGCTCCTGGC 

3600 

GCCAAGCAGT 

CCTCGCCGAT 

ATCCTCGCAG 

TTCGCCAGCG 

TGAGGCGGCA 

GCAGCTGCCG 

3660 

CCCAACTGCA 

GCATCGGCAG 

GGAAAGTGCC 

CGCTTCAAGG 

TGCTAAACAC 

GGATCAGGGC 

3720 

AAGAACCAGC 

AGAATCTCCT 

GGATCTCGAC 

GGCTCCTCGA 

TGTGCTACAA 

CGGTCTGGCA 

3780 

GACTCGGGCT 

GCGGTGGATC 

TCCCTCCCCG 

ATGGCCATGC 

TGATGTCGCA 

CGAGGACGAG 

3840 

CACGCGCTGT 

ACCACACGGC 

GGATGGGGAT 

CTGGACGACA 

TGGAACGACT 

GTACGTCAAG 

3900 

GTGGACGAGC 

AGCAGCCTCC 

ACAGCAGCAG 

CAGCAGCTGA 

TTCCCCTGGT 

CCCACAGCAT 

3960 

CCGGCGGAAG 

GTCACCTGCA 

GTCCTGGCGG 

AATCAGAGCA 

CGCGGAGCAG 

TCGGAAGAAC 

4020 

GGCCAGGAAT 

GCATCAAGGA 

ACCCAGCGAG 

TTGATCTACG 

CTCCGGGAAG 

CGTGGCCAGC 

4080 

GAACGGAGCC 

TCCTCAGCAA 

CTCGGGTAGC 

GGCACCAGCA 

GCCAGCCAGC 

TGGCCACAAT 

4140 

GTCTGA 






4146 


(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1381 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 

Gly Glu Asn Pro Arg Ile Ile Glu His Pro Met Asp Thr Thr Val Pro 
1 5 10 15 

Lys Asn Asp Pro Phe Thr Phe Asn Cys Gln Ala Glu Gly Asn Pro Thr 
20 25 30 

Pro Thr Ile Gln Trp Phe Lys Asp Gly Arg Glu Leu Lys Thr Asp Thr 
35 40 45 

Gly Ser His Arg Ile Met Leu Pro Ala Gly Gly Leu Phe Phe Leu Lys 
50 55 60 

Val Ile His Ser Arg Arg Glu Ser Asp Ala Gly Thr Tyr Trp Cys Glu 
65 70 75 80 

Ala Lys Asn Glu Phe Gly Val Ala Arg Ser Arg Asn Ala Thr Leu Gln 
85 90 95 

Val Ala Val Leu Arg Asp Glu Phe Arg Leu Glu Pro Ala Asn Thr Arg 
100 105 110 

Val Ala Gln Gly Glu Val Ala Leu Met Glu Cys Gly Ala Pro Arg Gly 
115 120 125 

Ser Pro Glu Pro Gln Ile Ser Trp Arg Lys Asn Gly Gln Thr Leu Asn 
130 135 140 

Leu Val Gly Asn Lys Arg Ile Arg Ile Val Asp Gly Gly Asn Leu Ala 
145 150 155 160 

Ile Gln Glu Ala Arg Gln Ser Asp Asp Gly Arg Tyr Gln Cys Val Val 
165 170 175 

Lys Asn Val Val Gly Thr Arg Glu Ser Ala Thr Ala Phe Leu Lys Val 
180 185 190 

His Val Arg Pro Phe Leu Ile Arg Gly Pro Gln Asn Gln Thr Ala Val 
195 200 205 

Val Gly Ser Ser Val Val Phe Gln Cys Arg Ile Gly Gly Asp Pro Leu 
210 215 220 

Pro Asp Val Leu Trp Arg Arg Thr Ala Ser Gly Gly Asn Met Pro Leu 
225 230 235 240 

Arg Lys Phe Ser Trp Leu His Ser Ala Ser Gly Arg Val His Val Leu 
245 250 255 

Glu Asp Arg Ser Leu Lys Leu Asp Asp Val Thr Leu Glu Asp Met Gly 
260 265 270 

Glu Tyr Thr Cys Glu 
275 

Gly He Leu Thr Val 
290 

Asn Gin Leu Val Glu 
305 

Asn Gly His Pro Arg 
325 

Ser Leu Leu Leu Pro 
340 

Thr Pro Glu Gly Arg 
355 

Asp Ser Gly Lys Val 
370 

Val Ser Ser Arg Thr 
385 

Pro Pro He He Glu 
405 

Ser He Val Val Leu 
420 

Val Ser Trp Tyr Leu 
435 

Arg Arg Asn Leu Ser 
450 

Arg His Glu Asp Glu 
465 

Gly Lys Ser Ser Trp 
485 

Pro Asn He Lys Phe 
500 

Pro Pro Gly Lys Pro 
515 

Leu Ser Trp Thr Arg 
530 

Tyr Val He Glu Met 
545 

Val Gly Thr Arg Val 
565 


Ala Asp Asn Ala Val 
280 

His Ala Pro Pro Lys 
295 

He Gly Asp Glu Val 
310 

Pro Thr Leu Tyr Trp 
330 

Gly Tyr Arg Asp Gly 
345 

Ser Val Leu Ser He 
360 

Val Thr Cys Asn Ala 
375 

Val Val Ser Val Asp 
390 

Gin Gly Pro Val Asn 
410 

Pro Cys Arg Thr Leu 
425 

Asp Gly He Pro He 
440 

Asp Ala Gly Ala Leu 
455 

Gly Leu Tyr Thr Cys 
470 

Ser Gly Tyr Leu Arg 
490 

Phe Arg Ala Pro Glu 
505 

Gin Met Val Glu Lys 
520 

Ser Asn Lys Val Gly 
535 

Phe Gly Lys Asn Glu 
550 

Gin Asn Thr Thr Phe 
570 
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Gly Gly He Thr Ala 
285 

Phe Val He Arg Pro 
300 

Leu Phe Glu Cys Gin 
315 

Ser Val Glu Gly Asn 
335 

Arg Met Glu Val Thr 
350 

Ala Arg Phe Ala Arg 
365 

Leu Asn Ala Val Gly 
380 

Thr Gin Phe Glu Leu 
395 

Gin Thr Leu Pro Val 
415 

Gly Thr Pro Val Pro 
430 

Asp Val Gin Glu His 
445 

Thr He Ser Asp Leu 
460 

Val Ala Ser Asn Arg 
475 

Leu Asp Thr Pro Thr 
495 

Leu Ser Thr Tyr Pro 
510 

Gly Glu Asn Ser Val 
525 

Gly Ser Ser Leu Val 
540 

Thr Asp Gly Trp Val 
555 

Thr Gin Thr Gly Leu 
575 


Thr 

Lys 

Ala 
320 
Ser 

Leu 

Glu 

Ser 

Pro 
400 
Lys 

Gin 

Glu 

Gin 

Asn 
480 
Asn 

Gly 

Thr 

Gly 

Ala 
560 
Leu 


Pro Gly Val Asn Tyr Phe Phe Leu Ile Arg Ala Glu Asn Ser His Gly
580 585 590

Leu Ser Leu Pro Ser Pro Met Ser Glu Pro Ile Thr Val Gly Thr Arg
595 600 605

Tyr Phe Asn Ser Gly Leu Asp Leu Ser Glu Ala Arg Ala Ser Leu Leu
610 615 620

Ser Gly Asp Val Val Glu Leu Ser Asn Ala Ser Val Val Asp Ser Thr
625 630 635 640

Ser Met Lys Leu Thr Trp Gln Ile Ile Asn Gly Lys Tyr Val Glu Gly
645 650 655

Phe Tyr Val Tyr Ala Arg Gln Leu Pro Asn Pro Ile Val Asn Asn Pro
660 665 670

Ala Pro Val Thr Ser Asn Thr Asn Pro Leu Leu Gly Ser Thr Ser Thr
675 680 685

Ser Ala Ser Ala Ser Ala Ser Ala Ser Ala Leu Ile Ser Thr Lys Pro
690 695 700

Asn Ile Ala Ala Ala Gly Lys Arg Asp Gly Glu Thr Asn Gln Ser Gly
705 710 715 720

Gly Gly Ala Pro Thr Pro Leu Asn Thr Lys Tyr Arg Met Leu Thr Ile
725 730 735

Leu Asn Gly Gly Gly Ala Ser Ser Cys Thr Ile Thr Gly Leu Val Gln
740 745 750

Tyr Thr Leu Tyr Glu Phe Phe Ile Val Pro Phe Tyr Lys Ser Val Glu
755 760 765

Gly Lys Pro Ser Asn Ser Arg Ile Ala Arg Thr Leu Glu Asp Val Pro
770 775 780

Ser Glu Ala Pro Tyr Gly Met Glu Ala Leu Leu Leu Asn Ser Ser Ala
785 790 795 800

Val Phe Leu Lys Trp Lys Ala Pro Glu Leu Lys Asp Arg His Gly Val
805 810 815

Leu Leu Asn Tyr His Val Ile Val Arg Gly Ile Asp Thr Ala His Asn
820 825 830

Phe Ser Arg Ile Leu Thr Asn Val Thr Ile Asp Ala Ala Ser Pro Thr
835 840 845

Leu Val Leu Ala Asn Leu Thr Glu Gly Val Met Tyr Thr Val Gly Val
850 855 860

Ala Ala Gly Asn Asn Ala Gly Val Gly Pro Tyr Cys Val Pro Ala Thr
865 870 875 880


Leu Arg Leu Asp Pro Ile Thr Lys Arg Leu Asp Pro Phe Ile Asn Gln
885 890 895

Arg Asp His Val Asn Asp Val Leu Thr Gln Pro Trp Phe Ile Ile Leu
900 905 910

Leu Gly Ala Ile Leu Ala Val Leu Met Leu Ser Phe Gly Ala Met Val
915 920 925

Phe Val Lys Arg Lys His Met Met Met Lys Gln Ser Ala Leu Asn Thr
930 935 940

Met Arg Gly Asn His Thr Ser Asp Val Leu Lys Met Pro Ser Leu Ser
945 950 955 960

Ala Arg Asn Gly Asn Gly Tyr Trp Leu Asp Ser Ser Thr Gly Gly Met
965 970 975

Val Trp Arg Pro Ser Pro Gly Gly Asp Ser Leu Glu Met Gln Lys Asp
980 985 990

His Ile Ala Asp Tyr Ala Pro Val Cys Gly Ala Pro Gly Ser Pro Ala
995 1000 1005

Gly Gly Gly Thr Ser Ser Gly Gly Ser Gly Gly Ala Gly Ser Gly Ala
1010 1015 1020

Ser Gly Gly Asp Asp Ile His Gly Gly His Gly Ser Glu Arg Asn Gln
1025 1030 1035 1040

Gln Arg Tyr Val Gly Glu Tyr Ser Asn Ile Pro Thr Asp Tyr Ala Glu
1045 1050 1055

Val Ser Ser Phe Gly Lys Ala Pro Ser Glu Tyr Gly Arg His Gly Asn
1060 1065 1070

Ala Ser Pro Ala Pro Tyr Ala Thr Ser Ser Ile Leu Ser Pro His Gln
1075 1080 1085

Gln Gln Gln Gln Gln Gln Pro Arg Tyr Gln Gln Arg Pro Val Pro Gly
1090 1095 1100

Tyr Gly Leu Gln Arg Pro Met His Pro His Tyr Gln Gln Gln Gln His
1105 1110 1115 1120

Gln Gln Gln Gln Ala Gln Gln Thr His Gln Gln His Gln Ala Leu Gln
1125 1130 1135

Gln His Gln Gln Leu Pro Pro Ser Asn Ile Tyr Gln Gln Met Ser Thr
1140 1145 1150

Thr Ser Glu Ile Tyr Pro Thr Asn Thr Gly Pro Ser Arg Ser Val Tyr
1155 1160 1165

Ser Glu Gln Tyr Tyr Tyr Pro Lys Asp Lys Gln Arg His Ile His Ile
1170 1175 1180

Thr Glu Asn Lys Leu Ser Asn Cys His Thr Tyr Glu Ala Ala Pro Gly
1185 1190 1195 1200

Ala Lys Gln Ser Ser Pro Ile Ser Ser Gln Phe Ala Ser Val Arg Arg
1205 1210 1215

Gln Gln Leu Pro Pro Asn Cys Ser Ile Gly Arg Glu Ser Ala Arg Phe
1220 1225 1230

Lys Val Leu Asn Thr Asp Gln Gly Lys Asn Gln Gln Asn Leu Leu Asp
1235 1240 1245

Leu Asp Gly Ser Ser Met Cys Tyr Asn Gly Leu Ala Asp Ser Gly Cys
1250 1255 1260

Gly Gly Ser Pro Ser Pro Met Ala Met Leu Met Ser His Glu Asp Glu
1265 1270 1275 1280

His Ala Leu Tyr His Thr Ala Asp Gly Asp Leu Asp Asp Met Glu Arg
1285 1290 1295

Leu Tyr Val Lys Val Asp Glu Gln Gln Pro Pro Gln Gln Gln Gln Gln
1300 1305 1310

Leu Ile Pro Leu Val Pro Gln His Pro Ala Glu Gly His Leu Gln Ser
1315 1320 1325

Trp Arg Asn Gln Ser Thr Arg Ser Ser Arg Lys Asn Gly Gln Glu Cys
1330 1335 1340

Ile Lys Glu Pro Ser Glu Leu Ile Tyr Ala Pro Gly Ser Val Ala Ser
1345 1350 1355 1360

Glu Arg Ser Leu Leu Ser Asn Ser Gly Ser Gly Thr Ser Ser Gln Pro
1365 1370 1375

Ala Gly His Asn Val
1380


(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3894 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATGTACTATC TAGGTTTTTA CCACACTCAC ACACACACAC ACACATACAT AAATTTTGAT 60

AAAATTCCTA ATGCCTCAAA TCTCGCTCCC GTGATAATCG AACATCCCAT CGATGTGGTG 120

GTATCTAGGG GATCGCCAGC AACCCTCAAC TGTGGTGCAA AGCCATCTAC CGCCAAAATC 180

ACATGGTACA AGGATGGACA GCCCGTAATC ACGAATAAGG AGCAAGTGAA CAGCCACCGG 240

ATTGTTCTCG ACACGGGATC CCTGTTTCTT CTGAAAGTGA ATAGTGGAAA AAACGGAAAA 300

GACAGCGATG CGGGAGCGTA CTATTGTGTG GCCAGCAACG AGCACGGAGA AGTGAAGTCG 360

AACGAAGGAT CGTTAAAATT GGCGATGCTT CGCGAAGACT TTCGAGTTCG GCCAAGAACA 420

GTTCAGGCTC TTGGTGGAGA GATGGCCGTT CTGGAATGCA GTCCGCCACG TGGATTCCCG 480

GAGCCGGTTG TGAGCTGGCG GAAAGACGAC AAAGAGCTCC GAATTCAAGA CATGCCACGA 540

TACACTCTAC ACTCTGACGG AAACCTCATC ATTGATCCGG TCGATCGAAG CGATTCTGGT 600

ACTTATCAGT GTGTTGCCAA CAACATGGTC GGAGAACGGG TGTCCAATCC CGCAAGATTG 660

AGTGTCTTTG AGAAACCAAA GTTTGAGCAA GAACCCAAGG ACATGACGGT CGACGTCGGA 720

GCCGCAGTGC TGTTTGATTG TCGTGTGACT GGAGATCCTC AACCACAAAT TACGTGGAAA 780

CGCAAAAATG AGCCGATGCC AGTTACACGT GCATACATTG CCAAGGATAA TCGGGGGTTG 840

AGAATCGAAA GAGTTCAACC ATCAGACGAA GGTGAATACG TTTGCTATGC ACGAAATCCA 900

GCGGGAACTC TTGAAGCATC TGCACATCTT CGTGTCCAGG CACCTCCATC CTTCCAGACA 960

AAACCAGCAG ACCAGTCAGT TCCAGCTGGA GGCACGGCAA CTTTTGAATG CACCTTGGTC 1020

GGTCAACCGA GTCCCGCCTA TTTTTGGAGC AAGGAAGGCC AACAGGATCT TCTTTTCCCA 1080

AGTTATGTGT CCGCTGATGG TAGAACGAAA GTTTCACCAA CTGGAACATT GACAATTGAG 1140

GAAGTTCGTC AAGTTGATGA GGGAGCTTAT GTGTGCGCTG GAATGAACTC GGGAGGAAGC 1200

TCGTTGAGCA AGGCAGCTTT GAAAGCAACA TTTGAAACCA AAGGCCGTGT CCAAAAAAAA 1260

AAGAGCAAAA TGGGCAAACA GAAACAAAAA AATGTTCAAT CAATTATCAA ATATTTAATT 1320

TCAGCCGTGA CCGGAAACAC ACCCGCCAAA CCACCACCAA CAATCGAGCA TGGTCATCAA 1380

AATCAGACCC TTATGGTTGG ATCATCAGCC ATCCTTCCAT GTCAGGCTAG CGGAAAACCA 1440

ACTCCAGGAA TATCATGGCT CAGGGATGGG CTACCTATTG ACATTACAGA TAGTCGTATC 1500

AGTCAACATT CAACGGGAAG TCTACATATT GCCGATTTAA AGAAACCTGA CACCGGAGTT 1560

TACACTTGCA TTGCGAAGAA CGAGGATGGA GAGTCAACAT GGTCGGCATC TCTGACTGTT 1620

GAAGATCACA CTAGCAATGC ACAATTTGTT CGGATGCCGG ATCCATCGAA CTTCCCGTCT 1680

TCTCCAACGC AACCCATTAT TGTCAATGTC ACTGATACCG AAGTAGAGCT CCACTGGAAT 1740

GCTCCCTCCA CATCTGGCGC AGGACCAATC ACTGGTTATA TCATTCAGTA CTACAGTCCA 1800

GACCTCGGAC AGACGTGGTT TAACATTCCA GACTACGTGG CATCTACTGA ATATAGAATA 1860

AAGGGTCTGA AACCATCTCA CTCGTATATG TTTGTGATTC GAGCAGAAAA TGAGAAAGGT 1920

ATTGGAACGC CGAGTGTGTC GTCGGCTCTC GTTACCACTA GCAAGCCAGC AGCTCAAGTT 1980

GCGCTTTCTG ACAAGAACAA AATGGACATG GCCATCGCTG AGAAGAGACT CACTTCGGAA 2040

CAACTCATAA AACTCGAGGA AGTGAAGACT ATTAATTCTA CGGCCGTTCG TTTGTTCTGG 2100

AAGAAGAGGA AACTTGAAGA GCTGATTGAT GGTTACTACA TCAAGTGGAG AGGGCCTCCA 2160

AGAACCAATG ATAATCAATA CGTGAATGTG ACCAGCCCTA GCACCGAAAA CTATGTTGTT 2220

TCAAATTTAA TGCCATTCAC CAACTATGAG TTTTTCGTGA TTCCTTATCA TTCCGGAGTT 2280

CATAGTATTC ATGGAGCACC GAGTAATTCC ATGGACGTGT TGACCGCCGA AGCTCCACCT 2340

TCATTGCCAC CAGAGGATGT GCGAATCCGT ATGCTCAACC TGACCACTCT TCGTATCTCT 2400

TGGAAAGCAC CAAAAGCCGA CGGCATCAAC GGAATTCTCA AAGGATTCCA AATTGTTATT 2460

GTTGGTCAAG CGCCCAACAA CAATCGGAAC ATCACTACAA ACGAGAGAGC TGCCAGTGTT 2520

ACTCTGTTCC ATTTAGTGAC TGGAATGACG TATAAAATTC GTGTAGCGGC TAGAAGCAAT 2580

GGTGGAGTTG GAGTCTCACA TGGAACGAGT GAAGTCATCA TGAATCAAGA CACGCTGGAA 2640

AAACACCTTG CTGCTCAACA AGAAAACGAA TCATTTTTGT ATGGGCTGAT CAATAAATCT 2700

CATGTTCCTG TGATTGTCAT TGTTGCAATT CTGATTATTT TCGTAGTCAT CATTATAGCC 2760

TATTGTTACT GGAGGAATAG CAGAAACAGT GATGGAAAGG ATCGAAGTTT TATAAAGATC 2820

AATGATGGAA GTGTTCATAT GGCTTCGAAT AATCTTTGGG ATGTTGCACA AAATCCGAAT 2880

CAGAATCCAA TGTACAACAC TGCTGGAAGA ATGACTATGA ACAATAGAAA TGGCCAGGCT 2940

CTCTATTCGC TGACACCAAA TGCGCAAGAC TTTTTCAACA ATTGTGATGA CTACAGTGGA 3000

ACGATGCACA GACCAGGATC CGAGCATCAC TATCATTATG CTCAACTGAC TGGCGGACCT 3060

GGTAATGCGA TGTCTACTTT TTATGGAAAC CAATATCACG ATGATCCATC TCCATATGCC 3120

ACCACAACAC TGGTCCTGTC GAACCAACAA CCAGCTTGGC TCAATGACAA AATGCTTCGC 3180

GCGCCAGCAA TGCCAACAAA TCCCGTGCCA CCAGAGCCAC CGGCGCGATA TGCAGATCAT 3240

ACCGCTGGAA GACGATCTCG ATCGAGCCGT GCATCCGATG GGAGAGGAAC TCTGAATGGC 3300

GGACTCCATC ACCGGACTAG CGGAAGTCAA CGGTCGGATA GTCCACCTCA CACAGATGTG 3360

AGCTATGTTC AGCTTCACTC ATCCGATGGA ACTGGTAGTA GTAAGGAAAG AACTGGGGAG 3420

CGGAGAACAC CACCGAATAA GACTCTGATG GACTTTATTC CGCCACCACC TTCCAATCCA 3480

CCACCACCTG GAGGGCACGT TTATGACACA GCAACTAGGC GTCAGTTGAA TCGTGGAAGT 3540

ACTCCACGAG AAGACACCTA CGATTCGGTC AGTGACGGAG CTTTTGCTCG GGTTGATGTG 3600

AATGCAAGGC CAACGAGTCG GAATCGGAAT TTGGGAGGAA GGCCGCTGAA AGGGAAACGA 3660

GACGACGATA GTCAGCGGTC TTCGTTGATG ATGGACGATG ATGGTGGATC TTCTGAAGCT 3720

GACGGGGAGA ACTCTGAAGG AGACGTTCCG CGTGGAGGTG TTAGAAAAGC AGTTCCTCGA 3780

ATGGGTATCT CTGCAAGTAC GCTGGCTCAT AGTTGTTACG GGACAAACGG CACTGCTCAA 3840

CGATTCCGGT CAATTCCACG TAACAATGGA ATCGTCACAC AAGAACAAAC TTGA 3894


(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1297 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Tyr Tyr Leu Gly Phe Tyr His Thr His Thr His Thr His Thr Tyr
1 5 10 15

Ile Asn Phe Asp Lys Ile Pro Asn Ala Ser Asn Leu Ala Pro Val Ile
20 25 30

Ile Glu His Pro Ile Asp Val Val Val Ser Arg Gly Ser Pro Ala Thr
35 40 45

Leu Asn Cys Gly Ala Lys Pro Ser Thr Ala Lys Ile Thr Trp Tyr Lys
50 55 60

Asp Gly Gln Pro Val Ile Thr Asn Lys Glu Gln Val Asn Ser His Arg
65 70 75 80

Ile Val Leu Asp Thr Gly Ser Leu Phe Leu Leu Lys Val Asn Ser Gly
85 90 95

Lys Asn Gly Lys Asp Ser Asp Ala Gly Ala Tyr Tyr Cys Val Ala Ser
100 105 110

Asn Glu His Gly Glu Val Lys Ser Asn Glu Gly Ser Leu Lys Leu Ala
115 120 125

Met Leu Arg Glu Asp Phe Arg Val Arg Pro Arg Thr Val Gln Ala Leu
130 135 140

Gly Gly Glu Met Ala Val Leu Glu Cys Ser Pro Pro Arg Gly Phe Pro
145 150 155 160

Glu Pro Val Val Ser Trp Arg Lys Asp Asp Lys Glu Leu Arg Ile Gln
165 170 175

Asp Met Pro Arg Tyr Thr Leu His Ser Asp Gly Asn Leu Ile Ile Asp
180 185 190

Pro Val Asp Arg Ser Asp Ser Gly Thr Tyr Gln Cys Val Ala Asn Asn
195 200 205

Met Val Gly Glu Arg Val Ser Asn Pro Ala Arg Leu Ser Val Phe Glu
210 215 220

Lys Pro Lys Phe Glu Gln Glu Pro Lys Asp Met Thr Val Asp Val Gly
225 230 235 240

Ala Ala Val Leu Phe Asp Cys Arg Val Thr Gly Asp Pro Gln Pro Gln
245 250 255

Ile Thr Trp Lys Arg Lys Asn Glu Pro Met Pro Val Thr Arg Ala Tyr
260 265 270

Ile Ala Lys Asp Asn Arg Gly Leu Arg Ile Glu Arg Val Gln Pro Ser
275 280 285

Asp Glu Gly Glu Tyr Val Cys Tyr Ala Arg Asn Pro Ala Gly Thr Leu
290 295 300

Glu Ala Ser Ala His Leu Arg Val Gln Ala Pro Pro Ser Phe Gln Thr
305 310 315 320

Lys Pro Ala Asp Gln Ser Val Pro Ala Gly Gly Thr Ala Thr Phe Glu
325 330 335

Cys Thr Leu Val Gly Gln Pro Ser Pro Ala Tyr Phe Trp Ser Lys Glu
340 345 350

Gly Gln Gln Asp Leu Leu Phe Pro Ser Tyr Val Ser Ala Asp Gly Arg
355 360 365

Thr Lys Val Ser Pro Thr Gly Thr Leu Thr Ile Glu Glu Val Arg Gln
370 375 380

Val Asp Glu Gly Ala Tyr Val Cys Ala Gly Met Asn Ser Ala Gly Ser
385 390 395 400

Ser Leu Ser Lys Ala Ala Leu Lys Ala Thr Phe Glu Thr Lys Gly Arg
405 410 415

Val Gln Lys Lys Lys Ser Lys Met Gly Lys Gln Lys Gln Lys Asn Val
420 425 430

Gln Ser Ile Ile Lys Tyr Leu Ile Ser Ala Val Thr Gly Asn Thr Pro
435 440 445

Ala Lys Pro Pro Pro Thr Ile Glu His Gly His Gln Asn Gln Thr Leu
450 455 460

Met Val Gly Ser Ser Ala Ile Leu Pro Cys Gln Ala Ser Gly Lys Pro
465 470 475 480

Thr Pro Gly Ile Ser Trp Leu Arg Asp Gly Leu Pro Ile Asp Ile Thr
485 490 495

Asp Ser Arg Ile Ser Gln His Ser Thr Gly Ser Leu His Ile Ala Asp
500 505 510

Leu Lys Lys Pro Asp Thr Gly Val Tyr Thr Cys Ile Ala Lys Asn Glu
515 520 525

Asp Gly Glu Ser Thr Trp Ser Ala Ser Leu Thr Val Glu Asp His Thr
530 535 540

Ser Asn Ala Gln Phe Val Arg Met Pro Asp Pro Ser Asn Phe Pro Ser
545 550 555 560

Ser Pro Thr Gln Pro Ile Ile Val Asn Val Thr Asp Thr Glu Val Glu
565 570 575

Leu His Trp Asn Ala Pro Ser Thr Ser Gly Ala Gly Pro Ile Thr Gly
580 585 590

Tyr Ile Ile Gln Tyr Tyr Ser Pro Asp Leu Gly Gln Thr Trp Phe Asn
595 600 605

Ile Pro Asp Tyr Val Ala Ser Thr Glu Tyr Arg Ile Lys Gly Leu Lys
610 615 620

Pro Ser His Ser Tyr Met Phe Val Ile Arg Ala Glu Asn Glu Lys Gly
625 630 635 640

Ile Gly Thr Pro Ser Val Ser Ser Ala Leu Val Thr Thr Ser Lys Pro
645 650 655

Ala Ala Gln Val Ala Leu Ser Asp Lys Asn Lys Met Asp Met Ala Ile
660 665 670

Ala Glu Lys Arg Leu Thr Ser Glu Gln Leu Ile Lys Leu Glu Glu Val
675 680 685

Lys Thr Ile Asn Ser Thr Ala Val Arg Leu Phe Trp Lys Lys Arg Lys
690 695 700

Leu Glu Glu Leu Ile Asp Gly Tyr Tyr Ile Lys Trp Arg Gly Pro Pro
705 710 715 720

Arg Thr Asn Asp Asn Gln Tyr Val Asn Val Thr Ser Pro Ser Thr Glu
725 730 735

Asn Tyr Val Val Ser Asn Leu Met Pro Phe Thr Asn Tyr Glu Phe Phe
740 745 750

Val Ile Pro Tyr His Ser Gly Val His Ser Ile His Gly Ala Pro Ser
755 760 765

Asn Ser Met Asp Val Leu Thr Ala Glu Ala Pro Pro Ser Leu Pro Pro
770 775 780

Glu Asp Val Arg Ile Arg Met Leu Asn Leu Thr Thr Leu Arg Ile Ser
785 790 795 800

Trp Lys Ala Pro Lys Ala Asp Gly Ile Asn Gly Ile Leu Lys Gly Phe
805 810 815

Gln Ile Val Ile Val Gly Gln Ala Pro Asn Asn Asn Arg Asn Ile Thr
820 825 830

Thr Asn Glu Arg Ala Ala Ser Val Thr Leu Phe His Leu Val Thr Gly
835 840 845

Met Thr Tyr Lys Ile Arg Val Ala Ala Arg Ser Asn Gly Gly Val Gly
850 855 860

Val Ser His Gly Thr Ser Glu Val Ile Met Asn Gln Asp Thr Leu Glu
865 870 875 880

Lys His Leu Ala Ala Gln Gln Glu Asn Glu Ser Phe Leu Tyr Gly Leu
885 890 895

Ile Asn Lys Ser His Val Pro Val Ile Val Ile Val Ala Ile Leu Ile
900 905 910

Ile Phe Val Val Ile Ile Ile Ala Tyr Cys Tyr Trp Arg Asn Ser Arg
915 920 925

Asn Ser Asp Gly Lys Asp Arg Ser Phe Ile Lys Ile Asn Asp Gly Ser
930 935 940

Val His Met Ala Ser Asn Asn Leu Trp Asp Val Ala Gln Asn Pro Asn
945 950 955 960

Gln Asn Pro Met Tyr Asn Thr Ala Gly Arg Met Thr Met Asn Asn Arg
965 970 975

Asn Gly Gln Ala Leu Tyr Ser Leu Thr Pro Asn Ala Gln Asp Phe Phe
980 985 990

Asn Asn Cys Asp Asp Tyr Ser Gly Thr Met His Arg Pro Gly Ser Glu
995 1000 1005

His His Tyr His Tyr Ala Gln Leu Thr Gly Gly Pro Gly Asn Ala Met
1010 1015 1020

Ser Thr Phe Tyr Gly Asn Gln Tyr His Asp Asp Pro Ser Pro Tyr Ala
1025 1030 1035 1040

Thr Thr Thr Leu Val Leu Ser Asn Gln Gln Pro Ala Trp Leu Asn Asp
1045 1050 1055

Lys Met Leu Arg Ala Pro Ala Met Pro Thr Asn Pro Val Pro Pro Glu
1060 1065 1070

Pro Pro Ala Arg Tyr Ala Asp His Thr Ala Gly Arg Arg Ser Arg Ser
1075 1080 1085

Ser Arg Ala Ser Asp Gly Arg Gly Thr Leu Asn Gly Gly Leu His His
1090 1095 1100

Arg Thr Ser Gly Ser Gln Arg Ser Asp Ser Pro Pro His Thr Asp Val
1105 1110 1115 1120

Ser Tyr Val Gln Leu His Ser Ser Asp Gly Thr Gly Ser Ser Lys Glu
1125 1130 1135

Arg Thr Gly Glu Arg Arg Thr Pro Pro Asn Lys Thr Leu Met Asp Phe
1140 1145 1150

Ile Pro Pro Pro Pro Ser Asn Pro Pro Pro Pro Gly Gly His Val Tyr
1155 1160 1165

Asp Thr Ala Thr Arg Arg Gln Leu Asn Arg Gly Ser Thr Pro Arg Glu
1170 1175 1180

Asp Thr Tyr Asp Ser Val Ser Asp Gly Ala Phe Ala Arg Val Asp Val
1185 1190 1195 1200

Asn Ala Arg Pro Thr Ser Arg Asn Arg Asn Leu Gly Gly Arg Pro Leu
1205 1210 1215

Lys Gly Lys Arg Asp Asp Asp Ser Gln Arg Ser Ser Leu Met Met Asp
1220 1225 1230

Asp Asp Gly Gly Ser Ser Glu Ala Asp Gly Glu Asn Ser Glu Gly Asp
1235 1240 1245

Val Pro Arg Gly Gly Val Arg Lys Ala Val Pro Arg Met Gly Ile Ser
1250 1255 1260

Ala Ser Thr Leu Ala His Ser Cys Tyr Gly Thr Asn Gly Thr Ala Gln
1265 1270 1275 1280

Arg Phe Arg Ser Ile Pro Arg Asn Asn Gly Ile Val Thr Gln Glu Gln
1285 1290 1295

Thr


(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4956 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:


ATGAAATGGA AACATGTTCC TTTTTTGGTC ATGATATCAC TCCTCAGCTT ATCCCCAAAT 60

CACCTGTTTC TGGCCCAGCT TATTCCAGAC CCTGAAGATG TAGAGAGGGG GAACGACCAC 120

GGGACGCCAA TCCCCACCTC TGATAACGAT GACAATTCGC TGGGCTATAC AGGCTCCCGT 180

CTTCGTCAGG AAGATTTTCC ACCTCGCATT GTTGAACACC CTTCAGACCT GATTGTCTCA 240

AAAGGAGAAC CTGCAACTTT GAACTGCAAA GCTGAAGGCC GCCCCACACC CACTATTGAA 300

TGGTACAAAG GGGGAGAGAG AGTGGAGACA GACAAAGATG ACCCTCGCTC ACACCGAATG 360

TTGCTGCCGA GTGGATCTTT ATTTTTCTTA CGTATAGTAC ATGGACGGAA AAGTAGACCT 420

GATGAAGGAG TCTATGTCTG TGTAGCAAGG AATTACCTTG GAGAGGCTGT GAGCCACAAT 480

GCATCGCTGG AAGTAGCCAT ACTTCGGGAT GACTTCAGAC AAAACCCTTC GGATGTCATG 540

GTTGCAGTAG GAGAGCCTGC AGTAATGGAA TGCCAACCTC CACGAGGCCA TCCTGAGCCC 600

ACCATTTCAT GGAAGAAAGA TGGCTCTCCA CTGGATGATA AAGATGAAAG AATAACTATA 660

CGAGGAGGAA AGCTCATGAT CACTTACACC CGTAAAAGTG ACGCTGGCAA ATATGTTTGT 720

GTTGGTACCA ATATGGTTGG GGAACGTGAG AGTGAAGTAG CCGAGCTGAC TGTCTTAGAG 780

AGACCATCAT TTGTGAAGAG ACCCAGTAAC TTGGCAGTAA CTGTGGATGA CAGTGCAGAA 840

TTTAAATGTG AGGCCCGAGG TGACCCTGTA CCTACAGTAC GATGGAGGAA AGATGATGGA 900

GAGCTGCCCA AATCCAGATA TGAAATCCGA GATGATCATA CCTTGAAAAT TAGGAAGGTG 960

ACAGCTGGTG ACATGGGTTC ATACACTTGT GTTGCAGAAA ATATGGTGGG CAAAGCTGAA 1020

GCATCTGCTA CTCTGACTGT TCAAGAACCT CCACATTTTG TTGTGAAACC CCGTGACCAG 1080

GTTGTTGCTT TGGGACGGAC TGTAACTTTT CAGTGTGAAG CAACCGGAAA TCCTCAACCA 1140

GCTATTTTCT GGAGGAGAGA AGGGAGTCAG AATCTACTTT TCTCATATCA ACCACCACAG 1200

TCATCCAGCC GATTTTCAGT CTCCCAGACT GGCGACCTCA CAATTACTAA TGTCCAGCGA 1260

TCTGATGTTG GTTATTACAT CTGCCAGACT TTAAATGTTG CTGGAAGCAT CATCACAAAG 1320

GCATATTTGG AAGTTACAGA TGTGATTGCA GATCGGCCTC CCCCAGTTAT TCGACAAGGT 1380

CCTGTGAATC AGACTGTAGC CGTGGATGGC ACTTTCGTCC TCAGCTGTGT GGCCACAGGC 1440

AGTCCAGTGC CCACCATTCT GTGGAGAAAG GATGGAGTCC TCGTTTCAAC CCAAGACTCT 1500

CGAATCAAAC AGTTGGAGAA TGGAGTACTG CAGATCCGAT ATGCTAAGCT GGGTGATACT 1560

GGTCGGTACA CCTGCATTGC ATCAACCCCC AGTGGTGAAG CAACATGGAG TGCTTACATT 1620

GAAGTTCAAG AATTTGGAGT TCCAGTTCAG CCTCCAAGAC CTACTGACCC AAATTTAATC 1680

CCTAGTGCCC CATCAAAACC TGAAGTGACA GATGTCAGCA GAAATACAGT CACATTATCG 1740

TGGCAACCAA ATTTGAATTC AGGAGCAACT CCAACATCTT ATATTATAGA AGCCTTCAGC 1800

CATGCATCTG GTAGCAGCTG GCAGACCGTA GCAGAGAATG TGAAAACAGA AACATCTGCC 1860

ATTAAAGGAC TCAAACCTAA TGCAATTTAC CTTTTCCTTG TGAGGGCAGC TAATGCATAT 1920

GGAATTAGTG ATCCAAGCCA AATATCAGAT CCAGTGAAAA CACAAGATGT CCTACCAACA 1980

AGTCAGGGGG TGGACCACAA GCAGGTCCAG AGAGAGCTGG GAAATGCTGT TCTGCACCTC 2040

CACAACCCCA CCGTCCTTTC TTCCTCTTCC ATCGAAGTGC ACTGGACAGT AGATCAACAG 2100

TCTCAGTATA TACAAGGATA TAAAATTCTC TATCGGCCAT CTGGAGCCAA CCACGGAGAA 2160

TCAGACTGGT TAGTTTTTGA AGTGAGGACG CCAGCCAAAA ACAGTGTGGT AATCCCTGAT 2220

CTCAGAAAGG GAGTCAACTA TGAAATTAAG GCTCGCCCTT TTTTTAATGA ATTTCAAGGA 2280

GCAGATAGTG AAATCAAGTT TGCCAAAACC CTGGAAGAAG CACCCAGTGC CCCACCCCAA 2340

GGTGTAACTG TATCCAAGAA TGATGGAAAC GGAACTGCAA TTCTAGTTAG TTGGCAGCCA 2400

CCTCCAGAAG ACACTCAAAA TGGAATGGTC CAAGAGTATA AGGTTTGGTG TCTGGGCAAT 2460

GAAACTCGAT ACCACATCAA CAAAACAGTG GATGGTTCCA CCTTTTCCGT GGTCATTCCC 2520

TTTCTTGTTC CTGGAATCCG ATACAGTGTG GAAGTGGCAG CCAGCACTGG GGCTGGGTCT 2580

GGGGTAAAGA GTGAGCCTCA GTTCATCCAG CTGGATGCCC ATGGAAACCC TGTGTCACCT 2640

GAGGACCAAG TCAGCCTCGC TCAGCAGATT TCAGATGTGG TGAAGCAGCC GGCCTTCATA 2700

GCAGGTATTG GAGCAGCCTG TTGGATCATC CTCATGGTCT TCAGCATCTG GCTTTATCGA 2760

CACCGCAAGA AGAGAAACGG ACTTACTAGT ACCTACGCGG GTATCAGAAA AGTCCCGTCT 2820

TTTACCTTCA CACCAACAGT AACTTACCAG AGAGGAGGCG AAGCTGTCAG CAGTGGAGGG 2880

AGGCCTGGAC TTCTCAACAT CAGTGAACCT GCCGCGCAGC CATGGCTGGC AGACACGTGG 2940

CCTAATACTG GCAACAACCA CAATGACTGC TCCATCAGCT GCTGCACGGC AGGCAATGGA 3000

AACAGCGACA GCAACCTCAC TACCTACAGT CGCCCAGCTG ATTGTATAGC AAATTATAAC 3060

AACCAACTGG ATAACAAACA AACAAATCTG ATGCTCCCTG AGTCAACTGT TTATGGTGAT 3120

GTGGACCTTA GTAACAAAAT CAATGAGATG AAAACCTTCA ATAGCCCAAA TCTGAAGGAT 3180

GGGCGTTTTG TCAATCCATC AGGGCAGCCT ACTCCTTACG CCACCACTCA GCTCATCCAG 3240

TCAAACCTCA GCAACAACAT GAACAATGGC AGCGGGGACT CTGGCGAGAA GCACTGGAAA 3300

CCACTGGGAC AGCAGAAACA AGAAGTGGCA CCAGTTCAGT ACAACATCGT GGAGCAAAAC 3360

AAGCTGAACA AAGATTATCG AGCAAATGAC ACAGTTCCTC CAACTATCCC ATACAACCAA 3420

TCATACGACC AGAACACAGG AGGATCCTAC AACAGCTCAG ACCGGGGCAG TAGTACATCT 3480

GGGAGTCAGG GGCACAAGAA AGGGGCAAGA ACACCCAAGG TACCAAAACA GGGTGGCATG 3540

AACTGGGCAG ACCTGCTTCC TCCTCCCCCA GCACATCCTC CTCCACACAG CAATAGCGAA 3600

GAGTACAACA TTTCTGTAGA TGAAAGCTAT GACCAAGAAA TGCCATGTCC CGTGCCACCA 3660

GCAAGGATGT ATTTGCAACA AGATGAATTA GAAGAGGAGG AAGATGAACG AGGCCCCACT 3720

CCCCCTGTTC GGGGAGCAGC TTCTTCTCCA GCTGCCGTGT CCTATAGCCA TCAGTCCACT 3780

GCCACTCTGA CTCCCTCCCC ACAGGAAGAA CTCCAGCCCA TGTTACAGGA TTGTCCAGAG 3840

GAGACTGGCC ACATGCAGCA CCAGCCCGAC AGGAGACGGC AGCCTGTGAG TCCTCCTCCA 3900

CCACCACGGC CGATCTCCCC TCCACATACC TATGGCTACA TTTCAGGACC CCTGGTCTCA 3960

GATATGGATA CGGATGCGCC AGAAGAGGAA GAAGACGAAG CCGACATGGA GGTAGCCAAG 4020

ATGCAAACCA GAAGGCTTTT GTTACGTGGG CTTGAGCAGA CACCTGCCTC CAGTGTTGGG 4080

GACCTGGAGA GCTCTGTCAC GGGGTCCATG ATCAACGGCT GGGGCTCAGC CTCAGAGGAG 4140

GACAACATTT CCAGCGGACG CTCCAGTGTT AGTTCTTCGG ACGGCTCCTT TTTCACTGAT 4200

GCTGACTTTG CCCAGGCAGT CGCAGCAGCG GCAGAGTATG CTGGTCTGAA AGTAGCACGA 4260

CGGCAAATGC AGGATGCTGC TGGCCGTCGA CATTTTCATG CGTCTCAGTG CCCTAGGCCC 4320

ACAAGTCCCG TGTCTACAGA CAGCAACATG AGTGCCGCCG TAATGCAGAA AACCAGACCA 4380

GCCAAGAAAC TGAAACACCA GCCAGGACAT CTGCGCAGAG AAACCTACAC AGATGATCTT 4440

CCACCACCTC CTGTGCCGCC ACCTGCTATA AAGTCACCTA CTGCCCAATC CAAGACACAG 4500

CTGGAAGTAC GACCTGTAGT GGTGCCAAAA CTCCCTTCTA TGGATGCAAG AACAGACAGA 4560

TCATCAGACA GAAAAGGAAG CAGTTACAAG GGGAGAGAAG TGTTGGATGG AAGACAGGTT 4620

GTTGACATGC GAACAAATCC AGGTGATCCC AGAGAAGCAC AGGAACAGCA AAATGACGGG 4680

AAAGGACGTG GAAACAAGGC AGCAAAACGA GACCTTCCAC CAGCAAAGAC TCATCTCATC 4740

CAAGAGGATA TTCTAGCTTA TTGTAGACCT ACTTTTCCAA CATCAAATAA TCCCAGAGAT 4800

CCCAGTTCCT CAAGCTCAAT GTCATCAAGA GGATCAGGAA GCAGACAAAG AGAACAAGCA 4860

AATGTAGGTC GAAGAAATAT TGCAGAAATG CAGGTACTTG GAGGATATGA AAGAGGAGAA 4920

GATAATAATG AAGAATTAGA GGAAACTGAA AGCTGA 4956


(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1651 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Lys Trp Lys His Val Pro Phe Leu Val Met Ile Ser Leu Leu Ser
1 5 10 15

Leu Ser Pro Asn His Leu Phe Leu Ala Gln Leu Ile Pro Asp Pro Glu
20 25 30

Asp Val Glu Arg Gly Asn Asp His Gly Thr Pro Ile Pro Thr Ser Asp
35 40 45

Asn Asp Asp Asn Ser Leu Gly Tyr Thr Gly Ser Arg Leu Arg Gln Glu
50 55 60

Asp Phe Pro Pro Arg Ile Val Glu His Pro Ser Asp Leu Ile Val Ser
65 70 75 80

Lys Gly Glu Pro Ala Thr Leu Asn Cys Lys Ala Glu Gly Arg Pro Thr
85 90 95

Pro Thr Ile Glu Trp Tyr Lys Gly Gly Glu Arg Val Glu Thr Asp Lys
100 105 110

Asp Asp Pro Arg Ser His Arg Met Leu Leu Pro Ser Gly Ser Leu Phe
115 120 125

Phe Leu Arg Ile Val His Gly Arg Lys Ser Arg Pro Asp Glu Gly Val
130 135 140

Tyr Val Cys Val Ala Arg Asn Tyr Leu Gly Glu Ala Val Ser His Asn
145 150 155 160

Ala Ser Leu Glu Val Ala Ile Leu Arg Asp Asp Phe Arg Gln Asn Pro
165 170 175

Ser Asp Val Met Val Ala Val Gly Glu Pro Ala Val Met Glu Cys Gln
180 185 190

Pro Pro Arg Gly His Pro Glu Pro Thr Ile Ser Trp Lys Lys Asp Gly
195 200 205

Ser Pro Leu Asp Asp Lys Asp Glu Arg Ile Thr Ile Arg Gly Gly Lys
210 215 220

Leu Met Ile Thr Tyr Thr Arg Lys Ser Asp Ala Gly Lys Tyr Val Cys
225 230 235 240

Val Gly Thr Asn Met Val Gly Glu Arg Glu Ser Glu Val Ala Glu Leu
245 250 255

Thr Val Leu Glu Arg Pro Ser Phe Val Lys Arg Pro Ser Asn Leu Ala
260 265 270

Val Thr Val Asp Asp Ser Ala Glu Phe Lys Cys Glu Ala Arg Gly Asp
275 280 285

Pro Val Pro Thr Val Arg Trp Arg Lys Asp Asp Gly Glu Leu Pro Lys
290 295 300

Ser Arg Tyr Glu Ile Arg Asp Asp His Thr Leu Lys Ile Arg Lys Val
305 310 315 320

Thr Ala Gly Asp Met Gly Ser Tyr Thr Cys Val Ala Glu Asn Met Val
325 330 335

Gly Lys Ala Glu Ala Ser Ala Thr Leu Thr Val Gln Glu Pro Pro His
340 345 350

Phe Val Val Lys Pro Arg Asp Gln Val Val Ala Leu Gly Arg Thr Val
355 360 365

Thr Phe Gln Cys Glu Ala Thr Gly Asn Pro Gln Pro Ala Ile Phe Trp
370 375 380

Arg Arg Glu Gly Ser Gln Asn Leu Leu Phe Ser Tyr Gln Pro Pro Gln
385 390 395 400

Ser Ser Ser Arg Phe Ser Val Ser Gln Thr Gly Asp Leu Thr Ile Thr
405 410 415

Asn Val Gln Arg Ser Asp Val Gly Tyr Tyr Ile Cys Gln Thr Leu Asn
420 425 430

Val Ala Gly Ser Ile Ile Thr Lys Ala Tyr Leu Glu Val Thr Asp Val
435 440 445

Ile Ala Asp Arg Pro Pro Pro Val Ile Arg Gln Gly Pro Val Asn Gln
450 455 460

Thr Val Ala Val Asp Gly Thr Phe Val Leu Ser Cys Val Ala Thr Gly
465 470 475 480

Ser Pro Val Pro Thr Ile Leu Trp Arg Lys Asp Gly Val Leu Val Ser
485 490 495

Thr Gln Asp Ser Arg Ile Lys Gln Leu Glu Asn Gly Val Leu Gln Ile
500 505 510

Arg Tyr Ala Lys Leu Gly Asp Thr Gly Arg Tyr Thr Cys Ile Ala Ser
515 520 525

Thr Pro Ser Gly Glu Ala Thr Trp Ser Ala Tyr Ile Glu Val Gln Glu
530 535 540

Phe Gly Val Pro Val Gln Pro Pro Arg Pro Thr Asp Pro Asn Leu Ile
545 550 555 560

Pro Ser Ala Pro Ser Lys Pro Glu Val Thr Asp Val Ser Arg Asn Thr
565 570 575

Val Thr Leu Ser Trp Gln Pro Asn Leu Asn Ser Gly Ala Thr Pro Thr
580 585 590

Ser Tyr Ile Ile Glu Ala Phe Ser His Ala Ser Gly Ser Ser Trp Gln
595 600 605

Thr Val Ala Glu Asn Val Lys Thr Glu Thr Ser Ala Ile Lys Gly Leu
610 615 620

Lys Pro Asn Ala Ile Tyr Leu Phe Leu Val Arg Ala Ala Asn Ala Tyr
625 630 635 640

Gly Ile Ser Asp Pro Ser Gln Ile Ser Asp Pro Val Lys Thr Gln Asp
645 650 655

Val Leu Pro Thr Ser Gln Gly Val Asp His Lys Gln Val Gln Arg Glu
660 665 670

Leu Gly Asn Ala Val Leu His Leu His Asn Pro Thr Val Leu Ser Ser
675 680 685

Ser Ser Ile Glu Val His Trp Thr Val Asp Gln Gln Ser Gln Tyr Ile
690 695 700

Gln Gly Tyr Lys Ile Leu Tyr Arg Pro Ser Gly Ala Asn His Gly Glu
705 710 715 720

Ser Asp Trp Leu Val Phe Glu Val Arg Thr Pro Ala Lys Asn Ser Val
725 730 735

Val Ile Pro Asp Leu Arg Lys Gly Val Asn Tyr Glu Ile Lys Ala Arg
740 745 750

Pro Phe Phe Asn Glu Phe Gln Gly Ala Asp Ser Glu Ile Lys Phe Ala
755 760 765

Lys Thr Leu Glu Glu Ala Pro Ser Ala Pro Pro Gln Gly Val Thr Val
770 775 780

Ser Lys Asn Asp Gly Asn Gly Thr Ala Ile Leu Val Ser Trp Gln Pro
785 790 795 800

Pro Pro Glu Asp Thr Gln Asn Gly Met Val Gln Glu Tyr Lys Val Trp
805 810 815

Cys Leu Gly Asn Glu Thr Arg Tyr His Ile Asn Lys Thr Val Asp Gly
820 825 830

Ser Thr Phe Ser Val Val Ile Pro Phe Leu Val Pro Gly Ile Arg Tyr
835 840 845

Ser Val Glu Val Ala Ala Ser Thr Gly Ala Gly Ser Gly Val Lys Ser
850 855 860

Glu Pro Gln Phe Ile Gln Leu Asp Ala His Gly Asn Pro Val Ser Pro
865 870 875 880

Glu Asp Gln Val Ser Leu Ala Gln Gln Ile Ser Asp Val Val Lys Gln
885 890 895

Pro Ala Phe Ile Ala Gly Ile Gly Ala Ala Cys Trp Ile Ile Leu Met
900 905 910

Val Phe Ser Ile Trp Leu Tyr Arg His Arg Lys Lys Arg Asn Gly Leu
915 920 925

Thr Ser Thr Tyr Ala Gly Ile Arg Lys Val Pro Ser Phe Thr Phe Thr
930 935 940

Pro Thr Val Thr Tyr Gln Arg Gly Gly Glu Ala Val Ser Ser Gly Gly
945 950 955 960

Arg Pro Gly Leu Leu Asn Ile Ser Glu Pro Ala Ala Gln Pro Trp Leu
965 970 975

Ala Asp Thr Trp Pro Asn Thr Gly Asn Asn His Asn Asp Cys Ser Ile 
980 985 990 

Ser Cys Cys Thr Ala Gly Asn Gly Asn Ser Asp Ser Asn Leu Thr Thr 
995 1000 1005 

Tyr Ser Arg Pro Ala Asp Cys Ile Ala Asn Tyr Asn Asn Gln Leu Asp 
1010 1015 1020 

Asn Lys Gln Thr Asn Leu Met Leu Pro Glu Ser Thr Val Tyr Gly Asp 
1025 1030 1035 1040 

Val Asp Leu Ser Asn Lys Ile Asn Glu Met Lys Thr Phe Asn Ser Pro 
1045 1050 1055 

Asn Leu Lys Asp Gly Arg Phe Val Asn Pro Ser Gly Gln Pro Thr Pro 
1060 1065 1070 

Tyr Ala Thr Thr Gln Leu Ile Gln Ser Asn Leu Ser Asn Asn Met Asn 
1075 1080 1085 

Asn Gly Ser Gly Asp Ser Gly Glu Lys His Trp Lys Pro Leu Gly Gln 
1090 1095 1100 

Gln Lys Gln Glu Val Ala Pro Val Gln Tyr Asn Ile Val Glu Gln Asn 
1105 1110 1115 1120 

Lys Leu Asn Lys Asp Tyr Arg Ala Asn Asp Thr Val Pro Pro Thr Ile 
1125 1130 1135 

Pro Tyr Asn Gln Ser Tyr Asp Gln Asn Thr Gly Gly Ser Tyr Asn Ser 
1140 1145 1150 

Ser Asp Arg Gly Ser Ser Thr Ser Gly Ser Gln Gly His Lys Lys Gly 
1155 1160 1165 

Ala Arg Thr Pro Lys Val Pro Lys Gln Gly Gly Met Asn Trp Ala Asp 
1170 1175 1180 

Leu Leu Pro Pro Pro Pro Ala His Pro Pro Pro His Ser Asn Ser Glu 
1185 1190 1195 1200 

Glu Tyr Asn Ile Ser Val Asp Glu Ser Tyr Asp Gln Glu Met Pro Cys 
1205 1210 1215 

Pro Val Pro Pro Ala Arg Met Tyr Leu Gln Gln Asp Glu Leu Glu Glu 
1220 1225 1230 

Glu Glu Asp Glu Arg Gly Pro Thr Pro Pro Val Arg Gly Ala Ala Ser 
1235 1240 1245 

Ser Pro Ala Ala Val Ser Tyr Ser His Gln Ser Thr Ala Thr Leu Thr 
1250 1255 1260 

Pro Ser Pro Gln Glu Glu Leu Gln Pro Met Leu Gln Asp Cys Pro Glu 
1265 1270 1275 1280 

Glu Thr Gly His Met Gln His Gln Pro Asp Arg Arg Arg Gln Pro Val 
1285 1290 1295 

Ser Pro Pro Pro Pro Pro Arg Pro Ile Ser Pro Pro His Thr Tyr Gly 
1300 1305 1310 

Tyr Ile Ser Gly Pro Leu Val Ser Asp Met Asp Thr Asp Ala Pro Glu 
1315 1320 1325 

Glu Glu Glu Asp Glu Ala Asp Met Glu Val Ala Lys Met Gln Thr Arg 
1330 1335 1340 

Arg Leu Leu Leu Arg Gly Leu Glu Gln Thr Pro Ala Ser Ser Val Gly 
1345 1350 1355 1360 

Asp Leu Glu Ser Ser Val Thr Gly Ser Met Ile Asn Gly Trp Gly Ser 
1365 1370 1375 

Ala Ser Glu Glu Asp Asn Ile Ser Ser Gly Arg Ser Ser Val Ser Ser 
1380 1385 1390 

Ser Asp Gly Ser Phe Phe Thr Asp Ala Asp Phe Ala Gln Ala Val Ala 
1395 1400 1405 

Ala Ala Ala Glu Tyr Ala Gly Leu Lys Val Ala Arg Arg Gln Met Gln 
1410 1415 1420 

Asp Ala Ala Gly Arg Arg His Phe His Ala Ser Gln Cys Pro Arg Pro 
1425 1430 1435 1440 

Thr Ser Pro Val Ser Thr Asp Ser Asn Met Ser Ala Ala Val Met Gln 
1445 1450 1455 

Lys Thr Arg Pro Ala Lys Lys Leu Lys His Gln Pro Gly His Leu Arg 
1460 1465 1470 

Arg Glu Thr Tyr Thr Asp Asp Leu Pro Pro Pro Pro Val Pro Pro Pro 
1475 1480 1485 

Ala Ile Lys Ser Pro Thr Ala Gln Ser Lys Thr Gln Leu Glu Val Arg 
1490 1495 1500 

Pro Val Val Val Pro Lys Leu Pro Ser Met Asp Ala Arg Thr Asp Arg 
1505 1510 1515 1520 

Ser Ser Asp Arg Lys Gly Ser Ser Tyr Lys Gly Arg Glu Val Leu Asp 
1525 1530 1535 

Gly Arg Gln Val Val Asp Met Arg Thr Asn Pro Gly Asp Pro Arg Glu 
1540 1545 1550 

Ala Gln Glu Gln Gln Asn Asp Gly Lys Gly Arg Gly Asn Lys Ala Ala 
1555 1560 1565 

Lys Arg Asp Leu Pro Pro Ala Lys Thr His Leu Ile Gln Glu Asp Ile 
1570 1575 1580 

Leu Pro Tyr Cys Arg Pro Thr Phe Pro Thr Ser Asn Asn Pro Arg Asp 
1585 1590 1595 1600 

Pro Ser Ser Ser Ser Ser Met Ser Ser Arg Gly Ser Gly Ser Arg Gln 
1605 1610 1615 

Arg Glu Gln Ala Asn Val Gly Arg Arg Asn Ile Ala Glu Met Gln Val 
1620 1625 1630 

Leu Gly Gly Tyr Glu Arg Gly Glu Asp Asn Asn Glu Glu Leu Glu Glu 
1635 1640 1645 

Thr Glu Ser 
1650 


(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 855..1187 

(D) OTHER INFORMATION: /note= "N signifies gap in sequence" 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 


CAGATTGTTG CTCAAGGTCG AACAGTGACA TTTCCCTGTG AAACTAAAGG AAACCCACAG 60 
CCAGCTGTTT TTTGGCAGAA AGAAGGCAGC CAGAACCTAC TTTTCCCAAA CCAACCCCAG 120 
CAGCCCAACA GTAGATGCTC AGTGTCACCA ACTGGAGACC TCACAATCAC CAACATTCAA 180 
CGTTCCGACG CGGGTTACTA CATCTGCCAG GCTTTAACTG TGGCAGGAAG CATTTTAGCA 240 
AAAGCTCAAC TGGAGGTTAC TGATGTTTTG ACAGATAGAC CTCCACCTAT AATTCTACAA 300 
GGCCCAGCCA ACCAAACGCT GGCAGTGGAT GGTACAGCGT TACTGAAATG TAAAGCCACT 360 
GGTGATCCTC TTCCTGTAAT TAGCTGGTTA AAGGAGGGAT TTACTTTTCC GGGTAGAGAT 420 
CCAAGAGCAA CAATTCAAGA GCAAGGCACA CTGCAGATTA AGAATTTACG GATTTCTGAT 480 
ACTGGCACTT ATACTTGTGT GGCTACAAGT TCAAGTGGAG AGGCTTCCTG GAGTGCAGTG 540 
CTGGATGTGA CAGAGTCTGG AGCAACAATC AGTAAAAACT ATGATTTAAG TGACCTGCCA 600 
GGGCCACCAT CCAAACCGCA AGTCACTGAT GTTACTAAGA ACAGTGTCAC CTTGTCCTGG 660 
CAGCCAGGTA CCCCTGGAAC CCTTCCAGCA AGTGCATATA TCATTGAGGC TTTCAGCCAA 720 
TCAGTGAGCA ACAGCTGGCA GACCGTGGCA AACCATGTAA AGACCACCCT CTATACTGTA 780 
AGAGGACTGC GGCCCAATAC AATCTACTTA TTCATGGTCA GAGCGATCAA CCCCAAGGTY 840 
TCAGTGACCC AAGTNAAACC ACAGAAAAAC AATGGATCCA CTTGGGCCAA TGTCCCTCTA 900 
CCTCCCCCCC CAGTCCAGCC CCTTCCTGGC ACGGAGCTGG AACACTATGC AGTGGAACAA 960 
CAAGAAAATG GCTATGACAG TGATAGCTGG TGCCCACCAT TGCCAGTACA AACTTACTTA 1020 
CACCAAGGTC TGGAAGATGA ACTGGAAGAA GATGATGATA GGGTCCCAAC ACCTCCTGTT 1080 
CGAGGCGTGG CTTCTTCTCC TGCTATCTCC TTTGGACAGC AGTCCACTGC AACTCTTACT 1140 
CCATCCCGAC GGGAAGAGAT GCAACCCATG CTGCAGGCTT CACCTNTTTA CCTCCTCTCA 1200 
AAGACCTCGA CCTACCAGCC CATTTTCTAC TGACAGTAAC ACCAGTGCAG CCCTGAGTCA 1260 
AAGTCAGAGG CCTCGGCCCA CTAAAAAACA CAAGGGAGGG 1300 


(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 434 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 
(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 285..396 

(D) OTHER INFORMATION: /note= "Xaa signifies gap in sequence" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Gln Ile Val Ala Gln Gly Arg Thr Val Thr Phe Pro Cys Glu Thr Lys 
1 5 10 15 

Gly Asn Pro Gln Pro Ala Val Phe Trp Gln Lys Glu Gly Ser Gln Asn 
20 25 30 

Leu Leu Phe Pro Asn Gln Pro Gln Gln Pro Asn Ser Arg Cys Ser Val 
35 40 45 

Ser Pro Thr Gly Asp Leu Thr Ile Thr Asn Ile Gln Arg Ser Asp Ala 
50 55 60 

Gly Tyr Tyr Ile Cys Gln Ala Leu Thr Val Ala Gly Ser Ile Leu Ala 
65 70 75 80 

Lys Ala Gln Leu Glu Val Thr Asp Val Leu Thr Asp Arg Pro Pro Pro 
85 90 95 

Ile Ile Leu Gln Gly Pro Ala Asn Gln Thr Leu Ala Val Asp Gly Thr 
100 105 110 

Ala Leu Leu Lys Cys Lys Ala Thr Gly Asp Pro Leu Pro Val Ile Ser 
115 120 125 

Trp Leu Lys Glu Gly Phe Thr Phe Pro Gly Arg Asp Pro Arg Ala Thr 
130 135 140 

Ile Gln Glu Gln Gly Thr Leu Gln Ile Lys Asn Leu Arg Ile Ser Asp 
145 150 155 160 

Thr Gly Thr Tyr Thr Cys Val Ala Thr Ser Ser Ser Gly Glu Ala Ser 
165 170 175 

Trp Ser Ala Val Leu Asp Val Thr Glu Ser Gly Ala Thr Ile Ser Lys 
180 185 190 

Asn Tyr Asp Leu Ser Asp Leu Pro Gly Pro Pro Ser Lys Pro Gln Val 
195 200 205 

Thr Asp Val Thr Lys Asn Ser Val Thr Leu Ser Trp Gln Pro Gly Thr 
210 215 220 

Pro Gly Thr Leu Pro Ala Ser Ala Tyr Ile Ile Glu Ala Phe Ser Gln 
225 230 235 240 

Ser Val Ser Asn Ser Trp Gln Thr Val Ala Asn His Val Lys Thr Thr 
245 250 255 

Leu Tyr Thr Val Arg Gly Leu Arg Pro Asn Thr Ile Tyr Leu Phe Met 
260 265 270 

Val Arg Ala Ile Asn Pro Lys Val Ser Val Thr Gln Xaa Lys Pro Gln 
275 280 285 

Lys Asn Asn Gly Ser Thr Trp Ala Asn Val Pro Leu Pro Pro Pro Pro 
290 295 300 

Val Gln Pro Leu Pro Gly Thr Glu Leu Glu His Tyr Ala Val Glu Gln 
305 310 315 320 

Gln Glu Asn Gly Tyr Asp Ser Asp Ser Trp Cys Pro Pro Leu Pro Val 
325 330 335 

Gln Thr Tyr Leu His Gln Gly Leu Glu Asp Glu Leu Glu Glu Asp Asp 
340 345 350 

Asp Arg Val Pro Thr Pro Pro Val Arg Gly Val Ala Ser Ser Pro Ala 
355 360 365 

Ile Ser Phe Gly Gln Gln Ser Thr Ala Thr Leu Thr Pro Ser Pro Arg 
370 375 380 

Glu Glu Met Gln Pro Met Leu Gln Ala Ser Pro Xaa Phe Thr Ser Ser 
385 390 395 400 

Gln Arg Pro Arg Pro Thr Ser Pro Phe Ser Thr Asp Ser Asn Thr Ser 
405 410 415 

Ala Ala Leu Ser Gln Ser Gln Arg Pro Arg Pro Thr Lys Lys His Lys 
420 425 430 

Gly Gly 
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(2) INFORMATION FOR SEQ ID NO : 11 : 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 444 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GCCCAGGCAG TTGCTGCAGC TGCGGAGTAT GCGGGCCTGA AAGTGGCTCG CCGCCAAATG 60 
CAAGATGCTG CTGGCCGCCG CCACTTCCAT GCCTCTCAGT GCCCAAGGCC CACGAGTCCT 120 
GTGTCCACAG ACAGCAACAT GAGTGCTGTT GTGATCCAGA AAGCCAGACC CGCCAAGAAG 180 
CAGAAACACC AGCCAGGACA TCTGCGCAGG GAAGCCTACG CAGATGATCT TCCACCCCCT 240 
CCAGTGCCAC CACCTGCTAT AAAATCGCCC ACTGTCCAGT CCAAGGCACA GCTGGAGGTA 300 
CGGCCTGTCA TGGTGCCAAA ACTCGCGTCT ATAGAAGCAA GGACAGATAG ATCGTCAGAC 360 
AGAAAAGGAG GCAGTTACAA GGGGAGAGAA GCTCTGGATG GAAGACAAGT CACTGACCTG 420 
CGAACAAATC CAAGTGACCC CAGA 444 


(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 148 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Ala Gln Ala Val Ala Ala Ala Ala Glu Tyr Ala Gly Leu Lys Val Ala 
1 5 10 15 

Arg Arg Gln Met Gln Asp Ala Ala Gly Arg Arg His Phe His Ala Ser 
20 25 30 

Gln Cys Pro Arg Pro Thr Ser Pro Val Ser Thr Asp Ser Asn Met Ser 
35 40 45 

Ala Val Val Ile Gln Lys Ala Arg Pro Ala Lys Lys Gln Lys His Gln 
50 55 60 

Pro Gly His Leu Arg Arg Glu Ala Tyr Ala Asp Asp Leu Pro Pro Pro 
65 70 75 80 

Pro Val Pro Pro Pro Ala Ile Lys Ser Pro Thr Val Gln Ser Lys Ala 
85 90 95 

Gln Leu Glu Val Arg Pro Val Met Val Pro Lys Leu Ala Ser Ile Glu 
100 105 110 

Ala Arg Thr Asp Arg Ser Ser Asp Arg Lys Gly Gly Ser Tyr Lys Gly 
115 120 125 

Arg Glu Ala Leu Asp Gly Arg Gln Val Thr Asp Leu Arg Thr Asn Pro 
130 135 140 

Ser Asp Pro Arg 
145 
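
As an illustrative cross-check that is not part of the original listing, the relationship between SEQ ID NO: 11 and SEQ ID NO: 12 can be sketched in Python. The sketch assumes the standard genetic code and a reading frame starting at base 1; the names `translate`, `CODON_TABLE` and `THREE_LETTER` are hypothetical and chosen only for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

def translate(dna):
    """Translate a DNA string (spaces allowed) into three-letter residues.

    Codons containing ambiguity codes such as N or Y are reported as Xaa.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        one_letter = CODON_TABLE.get(seq[i:i + 3], "X")
        residues.append(THREE_LETTER.get(one_letter, "Xaa"))
    return residues

# First 60 bases of SEQ ID NO: 11 as printed in the listing.
SEQ11_START = "GCCCAGGCAG TTGCTGCAGC TGCGGAGTAT GCGGGCCTGA AAGTGGCTCG CCGCCAAATG"
print(" ".join(translate(SEQ11_START)))
```

Under these assumptions the printed residues should agree with residues 1 to 20 of SEQ ID NO: 12, which is a quick way to catch transcription errors in either sequence.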
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WHAT IS CLAIMED IS: 

1. An isolated Robo polypeptide comprising SEQ ID NO:2, 4, 6, 8, 10 or 12, or a 
polypeptide domain thereof having at least 12 consecutive residues thereof and a 
Robo-specific activity, wherein said domain is encoded by neither EST yq76el2 nor yq76el2. 

2. An isolated polypeptide according to claim 1, wherein said activity is selected from at 
least one of a Robo-competitive binding, Robo-specific antigenicity and a Robo-specific 
immunogenicity. 

3. An isolated polypeptide according to claim 1, wherein said domain comprises at least 
one of a Robo immunoglobulin, fibronectin or cytoplasmic motif domain. 

4. A recombinant nucleic acid encoding a polypeptide according to claim 1. 

5. A cell comprising a nucleic acid according to claim 4. 

6. A method of making a Robo polypeptide, comprising the following steps: incubating a 
host cell or cellular extract containing a nucleic acid according to claim 4 under conditions 
whereby the polypeptide encoded by the nucleic acid is expressed and recovering the 
expressed polypeptide. 

7. An isolated Robo polypeptide made by the method of claim 6. 

8. An isolated robo nucleic acid comprising a strand of SEQ ID NO:1, 3, 5, 7, 9 or 11, or 
a fragment thereof having at least 24 consecutive bases thereof, and sufficient to specifically 
hybridize with a nucleic acid having the sequence defined by the corresponding opposite 
strand, wherein the fragment is contained in neither EST yq76el2 nor yq76el2. 

9. A method for modulating cell function or morphology comprising providing the cell with 
an agent which modulates activity of a Robo polypeptide or function of a robo gene, wherein 
the agent comprises a polypeptide according to claim 1 or a Robo-specific antibody. 
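
For illustration only, the fragment language of claim 8 can be sketched in Python. The sketch enumerates every run of 24 consecutive bases of a strand and checks that each one is perfectly complementary to the opposite strand. Perfect Watson-Crick complementarity is an assumption used here as a simplified stand-in for specific hybridization, which in practice depends on conditions that this sketch does not model; the helper names `clean`, `reverse_complement` and `fragments` are hypothetical.

```python
# Complement table for the four unambiguous DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def clean(strand):
    """Remove the spacing used in the listing and upper-case the bases."""
    return "".join(strand.split()).upper()

def reverse_complement(strand):
    """Return the opposite strand, read 5' to 3'."""
    return clean(strand).translate(COMPLEMENT)[::-1]

def fragments(strand, length=24):
    """List every run of `length` consecutive bases in the strand."""
    seq = clean(strand)
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]

# First 60 bases of SEQ ID NO: 11 as printed in the listing.
SEQ11_START = "GCCCAGGCAG TTGCTGCAGC TGCGGAGTAT GCGGGCCTGA AAGTGGCTCG CCGCCAAATG"

frags = fragments(SEQ11_START)
opposite = reverse_complement(SEQ11_START)
# Each fragment's reverse complement must occur somewhere in the opposite strand.
all_pair = all(reverse_complement(f) in opposite for f in frags)
print(len(frags), all_pair)
```

Ambiguity codes such as N or Y are left unchanged by `reverse_complement`, so regions flagged as gaps in the listing would need separate handling.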
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ABSTRACT OF THE DISCLOSURE 
Robo1 and Robo2 polypeptides may be produced recombinantly from transformed 
host cells from the disclosed Robo encoding nucleic acids or purified from human cells. The 
invention provides isolated Robo hybridization probes and primers capable of specifically 
hybridizing with the disclosed Robo genes, Robo-specific binding agents such as specific 
antibodies, and methods of making and using the subject compositions in diagnosis, therapy 
and in the biopharmaceutical industry. 
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SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Goodman, Corey S. 

Kidd, Thomas 
Mitchell, Kevin 
Tear, Guy 

(ii) TITLE OF INVENTION: Robo: A Novel Family of Polypeptide and 
Nucleic Acids 
(iii) NUMBER OF SEQUENCES: 12 
(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 

(B) STREET: 75 DENISE DRIVE 

(C) CITY: HILLSBOROUGH 

(D) STATE: CALIFORNIA 

(E) COUNTRY: USA 

(F) ZIP: 94010 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 
(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 
(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: OSMAN, RICHARD A 

(B) REGISTRATION NUMBER: 36,627 

(C) REFERENCE/DOCKET NUMBER: B98-006 
(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (650) 343-4341 

(B) TELEFAX: (650) 343-4342 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4188 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
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(D ) TOPOLOGY : linear 
(ii) MOLECULE TYPE: cDNA 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 


ATGCATCCCA TGCATCCCGA AAACCACGCC ATCGCCCGGA GCACGAGCAC CACTAATAAC 60 
CCATCTCGCA GTCGGAGCAG CAGGATGTGG CTCCTGCCCG CCTGGCTGCT CCTCGTCCTG 120 
GTGGCCAGCA ATGGCCTGCC AGCAGTCAGA GGCCAGTACC AATCGCCACG TATCATCGAG 180 
CATCCCACGG ATCTGGTCGT TAAGAAGAAT GAACCCGCCA CGCTCAACTG CAAAGTGGAG 240 
GGCAAGCCGG AACCCACCAT TGAGTGGTTT AAGGATGGCG AACCCGTCAG CACCAACGAA 300 
AAGAAATCGC ACCGCGTCCA GTTCAAGGAC GGCGCCCTCT TCTTTTACAG GACAATGCAA 360 
GGCAAGAAGG AGCAGGACGG CGGAGAGTAC TGGTGCGTGG CCAAGAACCG AGTGGGCCAG 420 
GCCGTTAGTC GCCATGCCTC CCTCCAGATA GCTGTTTTGC GCGACGATTT TCGCGTGGAG 480 
CCCAAAGACA CGCGAGTGGC CAAAGGCGAG ACGGCTCTGC TGGAGTGTGG GCCGCCCAAA 540 
GGCATTCCAG AGCCAACGCT GATTTGGATA AAGGACGGCG TTCCCTTGGA CGACCTGAAA 600 
GCCATGTCGT TTGGCGCCAG CTCCCGCGTT CGAATTGTGG ACGGTGGCAA CCTGCTGATC 660 
AGCAATGTGG AGCCCATTGA TGAGGGCAAC TACAAGTGCA TTGCCCAGAA TCTGGTAGGC 720 
ACCCGCGAGA GCAGCTATGC CAAGCTGATT GTCCAGGTCA AACCATACTT TATGAAGGAG 780 
CCCAAGGATC AGGTGATGCT CTACGGCCAG ACAGCCACTT TCCACTGCTC AGTGGGCGGT 840 
GATCCGCCGC CGAAAGTGTT GTGGAAAAAG GAGGAGGGCA ATATTCCGGT GTCCAGAGCG 900 
CGAATCCTTC ACGACGAGAA AAGTTTAGAG ATATCCAACA TAACGCCCAC CGATGAGGGC 960 
ACCTATGTCT GCGAGGCACA CAACAATGTC GGTCAGATCA GCGCTAGGGC TTCTCTTATA 1020 
GTCCACGCTC CGCCGAACTT TACGAAAAGA CCCAGTAACA AGAAAGTGGG ACTAAATGGG 1080 
GTTGTCCAAC TACCTTGCAT GGCCTCCGGA AACCCTCCGC CGTCTGTATT CTGGACCAAG 1140 
GAAGGAGTAT CCACTCTTAT GTTCCCAAAT AGTTCGCACG GAAGGCAGTA TGTGGCTGCC 1200 
GATGGAACTC TGCAGATTAC GGATGTGCGG CAGGAAGACG AAGGCTACTA TGTGTGTTCC 1260 
GCTTTCAGTG TAGTCGATTC CTCTACAGTA CGGGTTTTCC TGCAAGTCAG CTCGGTAGAC 1320 
GAGCGTCCAC CTCCGATTAT TCAAATCGGA CCTGCCAATC AAACACTGCC CAAGGGATCA 1380 
GTTGCTACTT TACCCTGTCG GGCCACTGGA AATCCCAGTC CCCGTATCAA GTGGTTCCAC 1440 
GATGGACATG CCGTACAAGC GGGCAATCGA TACAGCATCA TCCAAGGAAG CTCACTGAGA 1500 
GTCGATGACC TTCAACTAAG TGACTCTGGT ACCTACACCT GCACTGCATC TGGCGAACGA 1560 
GGAGAAACTT CCTGGGCTGC CACACTAACG GTGGAAAAAC CCGGTTCTAC ATCTCTTCAC 1620 
CGGGCAGCTG ATCCTAGCAC TTATCCTGCT CCTCCAGGAA CACCTAAAGT CCTGAATGTC 1680 
AGTCGCACCA GCATTAGTCT TCGTTGGGCT AAAAGCCAAG AGAAACCCGG AGCTGTGGGC 1740 
CCAATCATTG GATACACTGT AGAGTACTTC AGTCCGGATC TGCAAACTGG TTGGATTGTG 1800 
GCTGCCCATC GAGTCGGCGA CACTCAAGTC ACTATCTCGG GTCTCACTCC TGGCACTTCG 1860 
TATGTGTTCC TAGTTAGAGC TGAGAATACT CAGGGTATTT CTGTGCCTTC CGGCTTATCA 1920 
AATGTTATTA AAACCATTGA GGCAGATTTC GATGCAGCTT CTGCCAATGA TTTGTCAGCA 1980 
GCTCGAACTT TGCTGACAGG AAAGTCGGTG GAGCTAATAG ATGCCTCGGC TATCAATGCT 2040 
AGTGCCGTTA GACTTGAGTG GATGCTCCAC GTGAGCGCTG ATGAGAAATA CGTAGAGGGC 2100 




CTGCGCATAC ACTATAAGGA TGCCAGTGTA CCATCCGCAC AGTATCACTC GATCACTGTT 2160 
ATGGATGCCT CTGCAGAATC GTTTGTGGTG GGAAACCTTA AGAAGTACAC CAAGTATGAG 2220 
TTCTTCCTAA CACCCTTTTT TGAGACAATT GAAGGACAGC CCAGTAACTC CAAGACAGCC 2280 
CTCACCTATG AAGATGTTCC CTCCGCACCA CCGGATAACA TTCAGATTGG CATGTACAAC 2340 
CAAACAGCCG GTTGGGTGCG TTGGACTCCG CCACCCTCCC AGCACCACAA TGGCAATTTG 2400 
TATGGCTACA AGATTGAGGT CAGCGCCGGT AACACCATGA AGGTGCTGGC CAATATGACT 2460 
CTTAATGCTA CCACCACATC TGTGCTCCTA AATAACCTAA CCACCGGAGC TGTGTACAGC 2520 
GTGAGGTTGA ACTCCTTTAC CAAGGCAGGA GATGGACCTT ACTCCAAACC GATATCACTA 2580 
TTCATGGACC CCACCCATCA TGTGCATCCG CCACGGGCAC ATCCAAGCGG CACCCATGAT 2640 
GGGCGACATG AGGGACAGGA TCTCACGTAT CATAACAATG GCAACATACC ACCTGGCGAC 2700 
ATTAATCCCA CCACTCATAA AAAGACCACT GACTACCTAT CTGGACCGTG GCTAATGGTG 2760 
CTGGTCTGCA TCGTTCTTCT AGTCCTGGTT ATTTCGGCGG CTATTTCGAT GGTCTACTTC 2820 
AAGCGCAAGC ATCAAATGAC CAAGGAATTG GGTCACTTAA GTGTGGTCAG TGACAACGAA 2880 
ATAACCGCAT TAAATATCAA TAGCAAAGAG AGCCTTTGGA TAGACCATCA TCGTGGATGG 2940 
CGAACTGCCG ATACTGACAA AGACTCAGGA TTAAGCGAAT CGAAGCTACT ATCCCACGTT 3000 
AACAGCAGTC AATCCAACTA CAATAACTCC GATGGAGGAA CCGATTATGC AGAAGTTGAC 3060 
ACCCGTAACC TTACCACCTT CTACAATTGT CGCAAGAGCC CCGATAATCC CACGCCGTAC 3120 
GCCACCACTA TGATCATTGG TACCTCTTCC AGTGAGACCT GCACCAAGAC AACATCTATA 3180 
AGTGCCGATA AGGACTCGGG AACTCATTCG CCCTATTCTG ACGCATTTGC CGGTCAGGTG 3240 
CCAGCGGTTC CTGTTGTCAA ATCCAACTAT CTTCAGTATC CGGTTGAACC GATCAACTGG 3300 
TCAGAGTTTC TACCCCCGCC GCCAGAACAC CCACCTCCGT CTTCTACCTA TGGATACGCA 3360 
CAAGGATCTC CTGAATCTTC GCGGAAGAGC TCCAAAAGCG CAGGTTCCGG CATTTCTACA 3420 
AATCAAAGCA TTCTGAACGC ATCCATACAC AGCAGCTCCT CGGGCGGCTT TTCAGCTTGG 3480 
GGAGTATCGC CCCAATATGC TGTCGCCTGT CCACCGGAAA ACGTTTATAG CAATCCGCTG 3540 
TCGGCAGTGG CTGGCGGCAC CCAGAACCGC TATCAGATAA CGCCCACAAA CCAACATCCG 3600 
CCACAGTTAC CGGCCTACTT TGCCACCACG GGTCCAGGAG GAGCTGTACC ACCCAACCAC 3660 
CTGCCATTTG CCACACAGCG TCATGCAGCC AGCGAGTACC AGGCTGGACT GAATGCAGCG 3720 
CGATGTGCCC AAAGCCGCGC CTGCAACAGC TGCGATGCCT TGGCCACACC CTCGCCCATG 3780 
CAACCCCCAC CGCCAGTTCC CGTACCCGAG GGCTGGTACC AACCGGTGCA TCCCAATAGC 3840 
CACCCGATGC ACCCGACCTC CTCCAACCAC CAGATCTACC AGTGCTCCTC CGAGTGCTCG 3900 
GATCACTCGA GGAGCTCGCA GAGTCACAAG CGGCAGCTGC AGCTCGAGGA GCACGGCAGC 3960 
AGTGCCAAAC AACGCGGAGG ACACCACCGT CGACGAGCCC CGGTGGTGCA GCCGTGCATG 4020 
GAGAGCGAGA ACGAGAACAT GCTGGCGGAG TACGAGCAGC GCCAGTACAC CAGCGATTGC 4080 
TGCAATAGCT CCCGCGAGGG CGACACCTGC TCCTGCAGCG AGGGATCCTG TCTTTACGCC 4140 
GAGGCGGGCG AGCCGGCGCC TCGTCAAATG ACTGCTAAGA ACACCTAA 4188 


(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1395 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met His Pro Met His Pro Glu Asn His Ala Ile Ala Arg Ser Thr Ser 
1 5 10 15 

Thr Thr Asn Asn Pro Ser Arg Ser Arg Ser Ser Arg Met Trp Leu Leu 
20 25 30 

Pro Ala Trp Leu Leu Leu Val Leu Val Ala Ser Asn Gly Leu Pro Ala 
35 40 45 

Val Arg Gly Gln Tyr Gln Ser Pro Arg Ile Ile Glu His Pro Thr Asp 
50 55 60 

Leu Val Val Lys Lys Asn Glu Pro Ala Thr Leu Asn Cys Lys Val Glu 
65 70 75 80 

Gly Lys Pro Glu Pro Thr Ile Glu Trp Phe Lys Asp Gly Glu Pro Val 
85 90 95 

Ser Thr Asn Glu Lys Lys Ser His Arg Val Gln Phe Lys Asp Gly Ala 
100 105 110 

Leu Phe Phe Tyr Arg Thr Met Gln Gly Lys Lys Glu Gln Asp Gly Gly 
115 120 125 

Glu Tyr Trp Cys Val Ala Lys Asn Arg Val Gly Gln Ala Val Ser Arg 
130 135 140 

His Ala Ser Leu Gln Ile Ala Val Leu Arg Asp Asp Phe Arg Val Glu 
145 150 155 160 

Pro Lys Asp Thr Arg Val Ala Lys Gly Glu Thr Ala Leu Leu Glu Cys 
165 170 175 

Gly Pro Pro Lys Gly Ile Pro Glu Pro Thr Leu Ile Trp Ile Lys Asp 
180 185 190 

Gly Val Pro Leu Asp Asp Leu Lys Ala Met Ser Phe Gly Ala Ser Ser 
195 200 205 

Arg Val Arg Ile Val Asp Gly Gly Asn Leu Leu Ile Ser Asn Val Glu 
210 215 220 

Pro Ile Asp Glu Gly Asn Tyr Lys Cys Ile Ala Gln Asn Leu Val Gly 
225 230 235 240 

Thr Arg Glu Ser Ser Tyr Ala Lys Leu Ile Val Gln Val Lys Pro Tyr 
245 250 255 

Phe Met Lys Glu Pro Lys Asp Gln Val Met Leu Tyr Gly Gln Thr Ala 
260 265 270 

Thr Phe His Cys Ser Val Gly Gly Asp Pro Pro Pro Lys Val Leu Trp 
275 280 285 

Lys Lys Glu Glu Gly Asn Ile Pro Val Ser Arg Ala Arg Ile Leu His 
290 295 300 

Asp Glu Lys Ser Leu Glu Ile Ser Asn Ile Thr Pro Thr Asp Glu Gly 
305 310 315 320 

Thr Tyr Val Cys Glu Ala His Asn Asn Val Gly Gln Ile Ser Ala Arg 
325 330 335 

Ala Ser Leu Ile Val His Ala Pro Pro Asn Phe Thr Lys Arg Pro Ser 
340 345 350 

Asn Lys Lys Val Gly Leu Asn Gly Val Val Gln Leu Pro Cys Met Ala 
355 360 365 

Ser Gly Asn Pro Pro Pro Ser Val Phe Trp Thr Lys Glu Gly Val Ser 
370 375 380 

Thr Leu Met Phe Pro Asn Ser Ser His Gly Arg Gln Tyr Val Ala Ala 
385 390 395 400 

Asp Gly Thr Leu Gln Ile Thr Asp Val Arg Gln Glu Asp Glu Gly Tyr 
405 410 415 

Tyr Val Cys Ser Ala Phe Ser Val Val Asp Ser Ser Thr Val Arg Val 
420 425 430 

Phe Leu Gln Val Ser Ser Val Asp Glu Arg Pro Pro Pro Ile Ile Gln 
435 440 445 

Ile Gly Pro Ala Asn Gln Thr Leu Pro Lys Gly Ser Val Ala Thr Leu 
450 455 460 

Pro Cys Arg Ala Thr Gly Asn Pro Ser Pro Arg Ile Lys Trp Phe His 
465 470 475 480 

Asp Gly His Ala Val Gln Ala Gly Asn Arg Tyr Ser Ile Ile Gln Gly 
485 490 495 

Ser Ser Leu Arg Val Asp Asp Leu Gln Leu Ser Asp Ser Gly Thr Tyr 
500 505 510 

Thr Cys Thr Ala Ser Gly Glu Arg Gly Glu Thr Ser Trp Ala Ala Thr 
515 520 525 

Leu Thr Val Glu Lys Pro Gly Ser Thr Ser Leu His Arg Ala Ala Asp 
530 535 540 

Pro Ser Thr Tyr Pro Ala Pro Pro Gly Thr Pro Lys Val Leu Asn Val 
545 550 555 560 

Ser Arg Thr Ser Ile Ser Leu Arg Trp Ala Lys Ser Gln Glu Lys Pro 
565 570 575 

Gly Ala Val Gly Pro Ile Ile Gly Tyr Thr Val Glu Tyr Phe Ser Pro 
580 585 590 

Asp Leu Gln Thr Gly Trp Ile Val Ala Ala His Arg Val Gly Asp Thr 
595 600 605 

Gln Val Thr Ile Ser Gly Leu Thr Pro Gly Thr Ser Tyr Val Phe Leu 
610 615 620 

Val Arg Ala Glu Asn Thr Gln Gly Ile Ser Val Pro Ser Gly Leu Ser 
625 630 635 640 

Asn Val Ile Lys Thr Ile Glu Ala Asp Phe Asp Ala Ala Ser Ala Asn 
645 650 655 

Asp Leu Ser Ala Ala Arg Thr Leu Leu Thr Gly Lys Ser Val Glu Leu 
660 665 670 

Ile Asp Ala Ser Ala Ile Asn Ala Ser Ala Val Arg Leu Glu Trp Met 
675 680 685 

Leu His Val Ser Ala Asp Glu Lys Tyr Val Glu Gly Leu Arg Ile His 
690 695 700 

Tyr Lys Asp Ala Ser Val Pro Ser Ala Gln Tyr His Ser Ile Thr Val 
705 710 715 720 

Met Asp Ala Ser Ala Glu Ser Phe Val Val Gly Asn Leu Lys Lys Tyr 
725 730 735 

Thr Lys Tyr Glu Phe Phe Leu Thr Pro Phe Phe Glu Thr Ile Glu Gly 
740 745 750 

Gln Pro Ser Asn Ser Lys Thr Ala Leu Thr Tyr Glu Asp Val Pro Ser 
755 760 765 

Ala Pro Pro Asp Asn Ile Gln Ile Gly Met Tyr Asn Gln Thr Ala Gly 
770 775 780 

Trp Val Arg Trp Thr Pro Pro Pro Ser Gln His His Asn Gly Asn Leu 
785 790 795 800 

Tyr Gly Tyr Lys Ile Glu Val Ser Ala Gly Asn Thr Met Lys Val Leu 
805 810 815 

Ala Asn Met Thr Leu Asn Ala Thr Thr Thr Ser Val Leu Leu Asn Asn 
820 825 830 

Leu Thr Thr Gly Ala Val Tyr Ser Val Arg Leu Asn Ser Phe Thr Lys 
835 840 845 

Ala Gly Asp Gly Pro Tyr Ser Lys Pro Ile Ser Leu Phe Met Asp Pro 
850 855 860 

Thr His His Val His Pro Pro Arg Ala His Pro Ser Gly Thr His Asp 
865 870 875 880 

Gly Arg His Glu Gly Gln Asp Leu Thr Tyr His Asn Asn Gly Asn Ile 
885 890 895 

Pro Pro Gly Asp Ile Asn Pro Thr Thr His Lys Lys Thr Thr Asp Tyr 
900 905 910 

Leu Ser Gly Pro Trp Leu Met Val Leu Val Cys Ile Val Leu Leu Val 
915 920 925 

Leu Val Ile Ser Ala Ala Ile Ser Met Val Tyr Phe Lys Arg Lys His 
930 935 940 

Gln Met Thr Lys Glu Leu Gly His Leu Ser Val Val Ser Asp Asn Glu 
945 950 955 960 

Ile Thr Ala Leu Asn Ile Asn Ser Lys Glu Ser Leu Trp Ile Asp His 
965 970 975 

His Arg Gly Trp Arg Thr Ala Asp Thr Asp Lys Asp Ser Gly Leu Ser 
980 985 990 

Glu Ser Lys Leu Leu Ser His Val Asn Ser Ser Gln Ser Asn Tyr Asn 
995 1000 1005 

Asn Ser Asp Gly Gly Thr Asp Tyr Ala Glu Val Asp Thr Arg Asn Leu 
1010 1015 1020 

Thr Thr Phe Tyr Asn Cys Arg Lys Ser Pro Asp Asn Pro Thr Pro Tyr 
1025 1030 1035 1040 

Ala Thr Thr Met Ile Ile Gly Thr Ser Ser Ser Glu Thr Cys Thr Lys 
1045 1050 1055 

Thr Thr Ser Ile Ser Ala Asp Lys Asp Ser Gly Thr His Ser Pro Tyr 
1060 1065 1070 

Ser Asp Ala Phe Ala Gly Gln Val Pro Ala Val Pro Val Val Lys Ser 
1075 1080 1085 

Asn Tyr Leu Gln Tyr Pro Val Glu Pro Ile Asn Trp Ser Glu Phe Leu 
1090 1095 1100 

Pro Pro Pro Pro Glu His Pro Pro Pro Ser Ser Thr Tyr Gly Tyr Ala 
1105 1110 1115 1120 

Gln Gly Ser Pro Glu Ser Ser Arg Lys Ser Ser Lys Ser Ala Gly Ser 
1125 1130 1135 

Gly Ile Ser Thr Asn Gln Ser Ile Leu Asn Ala Ser Ile His Ser Ser 
1140 1145 1150 

Ser Ser Gly Gly Phe Ser Ala Trp Gly Val Ser Pro Gln Tyr Ala Val 
1155 1160 1165 

Ala Cys Pro Pro Glu Asn Val Tyr Ser Asn Pro Leu Ser Ala Val Ala 
1170 1175 1180 

Gly Gly Thr Gln Asn Arg Tyr Gln Ile Thr Pro Thr Asn Gln His Pro 
1185 1190 1195 1200 

Pro Gln Leu Pro Ala Tyr Phe Ala Thr Thr Gly Pro Gly Gly Ala Val 
1205 1210 1215 

Pro Pro Asn His Leu Pro Phe Ala Thr Gln Arg His Ala Ala Ser Glu 
1220 1225 1230 

Tyr Gln Ala Gly Leu Asn Ala Ala Arg Cys Ala Gln Ser Arg Ala Cys 
1235 1240 1245 

Asn Ser Cys Asp Ala Leu Ala Thr Pro Ser Pro Met Gln Pro Pro Pro 
1250 1255 1260 

Pro Val Pro Val Pro Glu Gly Trp Tyr Gln Pro Val His Pro Asn Ser 
1265 1270 1275 1280 

His Pro Met His Pro Thr Ser Ser Asn His Gln Ile Tyr Gln Cys Ser 
1285 1290 1295 

Ser Glu Cys Ser Asp His Ser Arg Ser Ser Gln Ser His Lys Arg Gln 
1300 1305 1310 

Leu Gln Leu Glu Glu His Gly Ser Ser Ala Lys Gln Arg Gly Gly His 
1315 1320 1325 

His Arg Arg Arg Ala Pro Val Val Gln Pro Cys Met Glu Ser Glu Asn 
1330 1335 1340 

Glu Asn Met Leu Ala Glu Tyr Glu Gln Arg Gln Tyr Thr Ser Asp Cys 
1345 1350 1355 1360 

Cys Asn Ser Ser Arg Glu Gly Asp Thr Cys Ser Cys Ser Glu Gly Ser 
1365 1370 1375 

Cys Leu Tyr Ala Glu Ala Gly Glu Pro Ala Pro Arg Gln Met Thr Ala 
1380 1385 1390 

Lys Asn Thr 
1395 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4146 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 
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(xi) SEQUENCE DESCRIPTION": SEQ ID NO : 3 : 


GG TG AAAAT C 

CACGCAl CAT 

CGAGC Al C L. C 

A 1 GGAC ALGA 


7\ 7\ 7\ r T l /~ t 7\ r Vf^'r^ TV 

AAA 1 bA I LLA 

60 

TTTACGTTTA 

ATTGCCAGGC 

CGAGGGCAAT 

CCAACACCAA 

CCATTCAATG 

GTTTAAGGAC 

12 0 

GGTCGCGAAC 

TGAAGACGGA 

TACGGGTTCG 

CATCGCATAA 

TGCTGCCCGC 

CGGGGGTCTA 

18 0 

TTCTTTCTCA 

AGGTTATCCA 

CTCACGTAGA 

GAG AG C GAT G 

y-1/-1 y-1 y— 1 y-~^ TV '"IrTHTlTi 

CGGGCACTTA 

CTGGTGCGAG 

24 0 

GCCAAAAACG 

AGTTTGGAGT 

GGCACGGTCC 

•71 /-i /-» 71 TV m /~1 /— i TV TV 

AGG AAT G C AA 

CGTTGCAAGT 

GGCAGTTCTC 

3 00 

CGCGACGAAT 

TCCGTTTGGA 

GCCGGCAAAT 

ACCCGCGTGG 

CCCAAGGCGA 

y— i y— 1 rn/— 1 y~i y— » y—i nmn 

GGTGGCCCTG 

360 

ATGGAATGCG 

GTGCCCCCCG 

AGGATCTC CG 

GAGCCGCAAA 

TCTCGTGGCG 

j^l Ti TV i^l TV TV jt*^< /~1 

C AAG AAC GG C 

42 0 

CAGACCCTGA 

ATCTTGTCGG 

GAACAAGCGG 

ATTCGCATTG 

TCGACGGTGG 

CAATCTGGCC 

480 

ATCCAGGAAG 

CCCGCCAATC 

GGACGACGGA 

CGCTACCAGT 

GTG TGG T C AA 

GAATGTGGTT 

54 0 

GGCACCCGGG 

AGTCGGCCAC 

CGCTTTTCTT 

AAAGTGCATG 

TACGTC CATT 

CCTCATCCGA 

60 0 

GGACCCCAGA 

ATCAGACGGC 

GGTGGTGGGC 

AGCTCGGTGG 

TCTTCCAGTG 

CCGCATCGGA 

660 

GGCGATCCCC 

TGCCTGATGT 

C CTGTGGCGA 

CGCACTGCCT 

y— f/-|y^*y— 1 y^f y~ly**^ /— 1 71 7\ 

CCGGCGGCAA 

TATGCCACTG 

72 0 

CGTAAGTTTT 

CTTGGCTTCA 

TTCAGCTTCA 

GGTCGTGTGC 

ACGTACTTGA 

GGACCGCAGT 

780 

CTGAAGCTGG 

ACGACGTTAC 

TCTGGAGGAC 

ATGGGCGAGT 

ACACTTGCGA 

GGCGGALAAT 

840 

GCGGTGGGCG 

GCATCACGGC 

CACTGGCATC 

C 1 LALCbTTC 

ACGCTCGCL.C 

CAAATTTbTb 

90 0 

ATACGCCCCA 

"A y"1 TV "A m /~t Tv »-* /~im 

AG AAT GAG C T 

GG TGGAGATC 

GGTGATGAAG 

TGCTGTTCGA 

GTGCCAAGCG 

960 

AATG G AC AT C 

/—I y— t y~ly— I TV y""t/—1 7V TV /~1 

CCCGACCAAC 

GCTCTACTGG 

TCGGTGGAGG 

GCAACAGCTC 

CCTGCTGCTC 

102 0 

CCCGGCTATC 

GGGATGGCCG 

CATGGAAGTG 

AGCCTGACGC 

y— 1 y— 1 y— 1 -jv y-t/*i/-tri /-1 /— ( 

CCGAGGGGCG 

y^f-rt y~ t y~t y~ t rr(y~f /~ii~n/~i 

CTCGGTGCTC 

108 0 

TCGATAGCTC 

GATTTGCCCG 

TGAGGATTCC 

GGAAAGGTGG 

TCACTTG CAA 

y— 1/— 1 y— »y— 1 /- irn/"1 Tv *A /""I 

CGCC CTGAAC 

114 0 

GCCGTGGGCA 

GCGTCAGCAG 

TCGGACTGTG 

GTCAGTGTGG 

ATACGCAATT 

CGAGCTGCCA 

1200 

CCGCCGATTA 

TCGAACAGGG 

GCCCGTGAAT 

CAAACGTTGC 

CCGTTAAATC 

AATTGTGGTT 

1260 

CTGCCATGCC 

GAACTCTGGG 

CACTCCAGTG 

CCACAGGTCT 

CTTGGT AC CT 

GGATGGCATA 

132 0 

CCCATCGATG 

TGCAGGAGCA 

CGAGCGGCGG 

■71 t\ m /^i m m m /~i 

AATCTTTCGG 

ACGCTGGAGC 

/■"1mm TV TV >r-1 y—'l TV 1 t tJ 11 

CTTAAC CATT 

1380 

TCGGATCTTC 

AGCGCCACGA 

GGATGAAGGC 

TTGTACACCT 

GCGTGGCCAG 

CAATCGCAAC 

1440 

t\ t\ tv tv m m 

GGAAAATCCT 

CTTGGAGTGG 

TTACCTTCGT 

CTGGAC AC C C 

C G AC AAATC C 

y—1 tv ti m Ti rn ^ t\ tv /~i 

G AAT AT C AAG 

1500 

TTCTTCAGAG 

CCCCAGAACT 

TTCCACCTAC 

CCAGGGCCGC 

CAGGAAAACC 

GCAAATGGTG 

1560 

GAGAAGGGCG 

AAAATTCGGT 

GACTCTCAGC 

TGGACGAGGA 

GCAACAAGGT 

GGGCGGCTCC 

162 0 

AGTCTGGTGG 

GCTATGTAAT 

CGAGATGTTT 

GGCAAAAACG 

AAACGGATGG 

CTGGGTGGCT 

1680 

GTGGGCACTA 

GGGTGCAAAA 

TACCACGTTT 

ACCCAAACGG 

GTCTGCTGCC 

GGGTGTGAAT 

1740 

TACTTCTTTC 

TAATTCGAGC 

CGAGAACTCC 

CATGGCTTAT 

CACTGCCCAG 

TCCGATGTCG 

1800 

GAACCCATTA 

CGGTGGGAAC 

GCGCTACTTC 

AATAGTGGTC 

TGGATCTGAG 

CGAGGCTCGT 

1860 

GCCAGTCTGC 

TGTCCGGAGA 

TGTTGTGGAG 

CTGAGCAACG 

CCAGTGTGGT 

GGACTCCACT 

1920 

AGCATGAAAC 

TCACCTGGCA 

GATCATCAAT 

GGCAAATACG 

TCGAGGGCTT 

CTATGTCTAT 

1980 

GCGAGACAGT 

TGCCAAATCC 

AATAGTCAAC 

AATCCGGCGC 

CCGTTACTAG 

CAATACCAAT 

2040 

CCGCTGCTGG 

GCTCTACATC 

CACATCCGCA 

TCCGCATCCG 

CCTCGGCATC 

GGCATTGATT 

2100 

TCGACAAAGC 

CAAATATTGC 

AGCTGCCGGC 

AAACGTGATG 

GGGAGACAAA 

CCAGAGTGGA 

2160 

GGAGGAGCTC 

CGACCCCACT 

GAACACCAAG 

TATCGCATGC 

TAACGATTCT 

CAATGGCGGT 

2220
GGCGCCTCAT

CCTGCACCAT

CACCGGGCTC 

GTCCAGTACA 

CGCTGTATGA 

ATTTTTCATC 

2280 

GTGCCATTTT 

ACAAATCCGT 

CGAGGGCAAG 

CCGTCGAATT 

CGCGCATCGC 

TCGCACCCTT 

2340

GAAGATGTTC 

CCTCTGAGGC 

ACCATATGGA

ATGGAGGCTC 

TGCTGTTGAA 

CTCCTCCGCG 

2400

GTCTTCCTCA 

VJ X *W X J- V— V_# J- 

AATGGAAGGC 

ACCAGAACTC 

AAGGATCGGC 

ATGGTGTTCT 

CTTGAACTAT 

w^ X X VjXiiTk.'w X X 

2460


TPPGAGGTAT 

X V— V— ■ UflUU X X 

TGAPAPTGCP 

x vjjnv_jn\_. x vj > — • v_ 

PAPAATTTPT 

_jriv*riri xxx v— x 

PAPGPATTTT 

V*rt.wVJ_flX xxx 

GACAAATGTP 

vjjr^v_jT^t^jri. x vj x v— 

2520

.iTi.V_ V_ JTY X V— VJJTTl. X Vj 

PPGPTTCGCP 

V_» V_ VJ V_. X X V—VJV-. V_ 

TAPTPTGGTT 

X -Ci w. X V— X VJVJ X X 

TTGGPPAATC 

TCACCGAAGG 

X V_ — ^1. > — . V_- VJrniVJ V_J 

CGTCATGTAC 

V^ VJ X V-xx X VJ X liV* 

2580

jri.v_ v_vj X vJvjvJv_vj 

TPPPPPPPPP 

A AATAAPGPT 

GGAGTTGGTP 

VJvJz-lvJ X X vjvj X v_ 

PTTATTPTPT 

V— X X jrt. X X VJ X \J X 

PPPAGPTAPT 

v— v— jTi-Vj v_ X jt-1.v_. X 

2640

t tp p pt tt pp 

X Xv_7V_vjX -L XvjVj 

ATPPPATPAP 

AAAGPGAPTP 

GATPPGTTPA 

T C AAT PAG PG 

GGAP CATGTT 

wvj^iv v-*.^i x vj x x 

2700

nn^uri X vj X \j v_» 

TPAPGPAGCP 

CTGGTTCATA 

V^ X W\J X X w X ,T"1 

ATAPTPCTGG 

GCGCPATCPT 

GGCCGTTCTT 

Www WW X X V^ X X 

2760

ATGPTGTPPT 

rii. vjv— X vJ X v_v_, X 

TTGGCGCAAT 

GGTPTTTGTP 

VJVJ X V_ XXX X Vj 

AAGPGCAAGC 

flrivJv^vJvv^lflvjv* 

ACATGATGAT

GAAGCAGTCG

vjJr^rl.sj v — jr^kVJ X V— vj 

2820

GPPPTAAATA 

PAATGCGTGG 

^-rin x vj \_* vj x 

CAATCACAPG 

AGPGAPGTGC 

TCAAAATGCC 

GAGTCTATPG 

Wfl.WT X V-» X xi X W* WT 

2880

PPPPPPAATP 

GAAAPPGPTA 

PTGGPTGGAP 

V*# X W* X VJ vj*a v-- 

TCCTPCAPCG 

X \v \^ X V— w*lw V- v_J 

GCGGAATGGT 

w ww wrin x w x 

GTGGCGTCPC 

VJ X WW WW X www 

2940

X v^vjv_»\^v_,Vj/vjV_. Vj 

PPPAPTPGPT 

GGAPATPPAA 

vjvj.rt.vj.iT. x vJV>rlri 

AAPGATPAPA 

TPGCPGAPTA 

TPPGPPPPTP 

X VJ V_ VJ V V_ VJVJ X v_ 

3000

x vjv_,vjvj x vjv„.v_,v-. 

PPGGTTPTPP 

v_» VJVJ X X V_ X V_- V_* 

GGCPGGCGGT 

VJ\J\— V*VJvJv-VJvJ X 

GGPAPPTPTT 

vjvj v^r^.v-.v^ x v— X X 

PCGGTGPATP 

V** Wi VJVJ X w\Jxl X V-- 

PGGTGGPGPG 

v_ VJVJ X VJVJ V_VJ V_» VJ 

3060

GGPAGPGGTP 

iJVJ>^jT-VVJ V_. VJVJ x vj 

PPAGCGGCGG 

PGATGAPATT 

X \jriwjrl X X 

PATGGAGGAP 

ACG GPAG CGA 

ilV VJ W VdrJT^VJ V» Wil 

ACGPAATPAG 

r^wwwfui^ WjTiw 

3120

CAGCGGTACG 

TGGGCGAGTA 

CTCCAACATA

CCGACCGACT 

ATGCAGAGGT 

GTCCAGTTTT

3180

GGCAAGGCAC 

\J J V_* i^-T^Wf WJ V-*_Tj- V-* 

CCAGCGAGTA 

TGGTCGGCAT 

GGCAACGCCT 

CCCCGGCCCC 

TTATGCCACC 

3240

TCTTCGATCC 

TGAGTCCCCA 

CCAGCAGCAA 

CAGCAGCAGC 

AGCCGCGTTA 

TCAACAGCGA 

3300 

PCAGTPPPPP 

V_, V—JTIVJ X VJ V_ \_< V_ VJ 

GPTATGGGCT 

VJ \_ J. -FT. X VJVJ VJ V_, X 

CCAGCGCCCA 

v_, wriwwVjv^ v-* v_-.ct. 

ATGPACCPAP 

ACTACCAGCA 

n V-- ± /I w- wrrvr w>r^> 

GCAGCAGCAT 

wwr^wwiT.w wri x 

3360


APPPPPAGPA 

PAPPPAPPAP 

PAAPAPPAPG 

PTPTPPAGPA 

L J. v.1 v*V*rt\Jv*fl. 

GP APPAPP AA 

vj V-.jrlv_ v_.jri.vj v_jrVrt. 

3420

V— X VJ\y v_,j!-i.V_. v_- ^.jtA. 

PPAAPATPTA 

UUriflLrl X v^, x x~i 

PPAGPAGA TP 

v_ v_.TTttjv-.jri.vj.rt. x vj 

TPPAPPAPPA 

GPGAGATATA 

VJ V_. VjXT-VJ^l X JTTL X Jr^. 

PPPPAPGAAP 

3480

a rnnnTr ptt 

PPPPtPTPTGT 

LULvJL X V— X \J X 

PTAPTPTPAP 
v— X .tt.v— X v_, X Urlu 

P APT ATT APT 

V_,jTVvJ X jTi. X X-n^v— X 

APPPPAAPPA 

PA APP A PAP A 
v— i~i.fj.VJ v_ j!^vjjrlvjjr-l. 

3540

PAPATPPArA 

v_^.r-iv_jrl. X \_* v_.rtv_.iri. 

TPAPPGAGAA 

PAAGPTGAGP 

AAPTGPPAPA 

PCTATGAGGP 

v_. v_ x jrt. x \Jjj-i.\J vj v_ 

GGPTPPTPPP 

vjvj v_. X V—v_ X vjvjv_- 

3600

PPPA APtPAPtT 
VjV„V_..ri-r^VJV^.ri\J X 

PPTPGPPG AT 

v_ v_ x v_, vj v— v_, vjrr x 

ATPPTPGPAG 

rl X V_ V_. X V— VJ V— jTivJ 

TTCGCCAGCG 

X X V_ VJ V_. V_.JTT.VJ V_ VJ 

TGAGGCGGPA 

X W jTI w W W\JW wii 

GCAGCTGCCG 

VJ W-CT.VT w X ww W VJ 

3660 

PPPAAPTGPA 

GPATPGGPAG 

GG A A AGTGPP 

vj VJ-TTJ-VTiVJ X VJ V_r \_ 

PGPTTPAAGG 

wVj v* x X v,rVT.vJVJ 

TGPTAAACAC 

x vj <w- innriwiriw 

GGATPAGGGC 

vj vjjn. x v^iivj v_i *vj v — 

3720

A AHA APPAPP 

.tA_^vJ.rtJ--i.\_. Uriu v_ 

APAATPTPPT 

jri.'OjrijrA. X v_ X v_ v_, X 

PPATPTPPAP 

vJVJjrl. X \— . X V*Vxrt.L 

PPPTPPTPGA 

VJ\J\_ X v_v_, X v_ vjj*-i. 

TGTGPTAPAA 

X vj X vJV_, X j^Ljrijri. 

PGPTPTGGPA 

■wVJVj IV/ 1 OVJ V_jTT. 

3780

GAPTPGGGPT 

GPGGTGGATP 

vj v_, vjvj x vjvjn x v — ■ 

TPCPTPPPPG 

X V— V_ V — X Vj V_» V— V_» Vj 

ATGGPPATGP 

Jri X vjvj v_ v_. jri. x v_r v_. 

TGATGTPGPA 

X VJXT. X VJ X V*- VJ V^-xj- 

PGAGGAPGAG 

V_* VJ/-1VJ VJjn.\_, VJ^-iVJ 

3840

p a pp p p p tp,t 


pp a Tannn-j^ t 

vjvjjri. X U\J vjvjjri. X 

PTPPAPPAPA 

TPP A A PPA PT 

X U Ufifiv* vjriv, X 

PT APPTPA AP 
vj X jf-i. v_ vj X v_ jri-rivj 

3900

rypPP A PP A PP 

VJ X vJVJ>ilw,Vj.r"i.vj\_> 

APPAPPPTPP 

i^\J v_i-i.\J V_ v_- X V_ V_ 

AP APPAPP AP 

P APP APPTPA 

TTPPPPTPPT 

X X L v_\_.\_ X vjvj X 

PPP A P APP AT 
v_ v_ v_jri.v_i-i.vj LjHl X 


CCGGCGGAAG 

GTCACCTGCA 

GTCCTGGCGG 

AATCAGAGCA 

CGCGGAGCAG 

TCGGAAGAAC 

4020 

GGCCAGGAAT 

GCATCAAGGA 

ACCCAGCGAG 

TTGATCTACG 

CTCCGGGAAG 

CGTGGCCAGC 

4080 

GAACGGAGCC 

TCCTCAGCAA 

CTCGGGTAGC 

GGCACCAGCA 

GCCAGCCAGC 

TGGCCACAAT 

4140 

GTCTGA 






4146 


(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1381 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Gly Glu Asn Pro Arg Ile Ile Glu His Pro Met Asp Thr Thr Val Pro

1 5 10 15

Lys Asn Asp Pro Phe Thr Phe Asn Cys Gln Ala Glu Gly Asn Pro Thr

20 25 30

Pro Thr Ile Gln Trp Phe Lys Asp Gly Arg Glu Leu Lys Thr Asp Thr

35 40 45

Gly Ser His Arg Ile Met Leu Pro Ala Gly Gly Leu Phe Phe Leu Lys

50 55 60

Val Ile His Ser Arg Arg Glu Ser Asp Ala Gly Thr Tyr Trp Cys Glu
65 70 75 80

Ala Lys Asn Glu Phe Gly Val Ala Arg Ser Arg Asn Ala Thr Leu Gln

85 90 95

Val Ala Val Leu Arg Asp Glu Phe Arg Leu Glu Pro Ala Asn Thr Arg

100 105 110

Val Ala Gln Gly Glu Val Ala Leu Met Glu Cys Gly Ala Pro Arg Gly

115 120 125

Ser Pro Glu Pro Gln Ile Ser Trp Arg Lys Asn Gly Gln Thr Leu Asn

130 135 140

Leu Val Gly Asn Lys Arg Ile Arg Ile Val Asp Gly Gly Asn Leu Ala
145 150 155 160

Ile Gln Glu Ala Arg Gln Ser Asp Asp Gly Arg Tyr Gln Cys Val Val

165 170 175

Lys Asn Val Val Gly Thr Arg Glu Ser Ala Thr Ala Phe Leu Lys Val

180 185 190

His Val Arg Pro Phe Leu Ile Arg Gly Pro Gln Asn Gln Thr Ala Val

195 200 205

Val Gly Ser Ser Val Val Phe Gln Cys Arg Ile Gly Gly Asp Pro Leu

210 215 220 

Pro Asp Val Leu Trp Arg Arg Thr Ala Ser Gly Gly Asn Met Pro Leu 
225 230 235 240 

Arg Lys Phe Ser Trp Leu His Ser Ala Ser Gly Arg Val His Val Leu 

245 250 255 

Glu Asp Arg Ser Leu Lys Leu Asp Asp Val Thr Leu Glu Asp Met Gly 
260 265 270

Glu Tyr Thr Cys Glu Ala Asp Asn Ala Val Gly Gly Ile Thr Ala Thr

275 280 285

Gly Ile Leu Thr Val His Ala Pro Pro Lys Phe Val Ile Arg Pro Lys

290 295 300

Asn Gln Leu Val Glu Ile Gly Asp Glu Val Leu Phe Glu Cys Gln Ala
305 310 315 320

Asn Gly His Pro Arg Pro Thr Leu Tyr Trp Ser Val Glu Gly Asn Ser

325 330 335

Ser Leu Leu Leu Pro Gly Tyr Arg Asp Gly Arg Met Glu Val Thr Leu

340 345 350

Thr Pro Glu Gly Arg Ser Val Leu Ser Ile Ala Arg Phe Ala Arg Glu

355 360 365

Asp Ser Gly Lys Val Val Thr Cys Asn Ala Leu Asn Ala Val Gly Ser

370 375 380

Val Ser Ser Arg Thr Val Val Ser Val Asp Thr Gln Phe Glu Leu Pro
385 390 395 400

Pro Pro Ile Ile Glu Gln Gly Pro Val Asn Gln Thr Leu Pro Val Lys

405 410 415

Ser Ile Val Val Leu Pro Cys Arg Thr Leu Gly Thr Pro Val Pro Gln

420 425 430

Val Ser Trp Tyr Leu Asp Gly Ile Pro Ile Asp Val Gln Glu His Glu

435 440 445

Arg Arg Asn Leu Ser Asp Ala Gly Ala Leu Thr Ile Ser Asp Leu Gln

450 455 460

Arg His Glu Asp Glu Gly Leu Tyr Thr Cys Val Ala Ser Asn Arg Asn
465 470 475 480

Gly Lys Ser Ser Trp Ser Gly Tyr Leu Arg Leu Asp Thr Pro Thr Asn

485 490 495

Pro Asn Ile Lys Phe Phe Arg Ala Pro Glu Leu Ser Thr Tyr Pro Gly

500 505 510

Pro Pro Gly Lys Pro Gln Met Val Glu Lys Gly Glu Asn Ser Val Thr

515 520 525

Leu Ser Trp Thr Arg Ser Asn Lys Val Gly Gly Ser Ser Leu Val Gly

530 535 540

Tyr Val Ile Glu Met Phe Gly Lys Asn Glu Thr Asp Gly Trp Val Ala
545 550 555 560

Val Gly Thr Arg Val Gln Asn Thr Thr Phe Thr Gln Thr Gly Leu Leu

565 570 575



Pro Gly Val Asn Tyr Phe Phe Leu Ile Arg Ala Glu Asn Ser His Gly

580 585 590

Leu Ser Leu Pro Ser Pro Met Ser Glu Pro Ile Thr Val Gly Thr Arg

595 600 605

Tyr Phe Asn Ser Gly Leu Asp Leu Ser Glu Ala Arg Ala Ser Leu Leu

610 615 620

Ser Gly Asp Val Val Glu Leu Ser Asn Ala Ser Val Val Asp Ser Thr
625 630 635 640

Ser Met Lys Leu Thr Trp Gln Ile Ile Asn Gly Lys Tyr Val Glu Gly

645 650 655

Phe Tyr Val Tyr Ala Arg Gln Leu Pro Asn Pro Ile Val Asn Asn Pro

660 665 670

Ala Pro Val Thr Ser Asn Thr Asn Pro Leu Leu Gly Ser Thr Ser Thr

675 680 685

Ser Ala Ser Ala Ser Ala Ser Ala Ser Ala Leu Ile Ser Thr Lys Pro

690 695 700

Asn Ile Ala Ala Ala Gly Lys Arg Asp Gly Glu Thr Asn Gln Ser Gly
705 710 715 720

Gly Gly Ala Pro Thr Pro Leu Asn Thr Lys Tyr Arg Met Leu Thr Ile

725 730 735

Leu Asn Gly Gly Gly Ala Ser Ser Cys Thr Ile Thr Gly Leu Val Gln

740 745 750

Tyr Thr Leu Tyr Glu Phe Phe Ile Val Pro Phe Tyr Lys Ser Val Glu

755 760 765

Gly Lys Pro Ser Asn Ser Arg Ile Ala Arg Thr Leu Glu Asp Val Pro

770 775 780

Ser Glu Ala Pro Tyr Gly Met Glu Ala Leu Leu Leu Asn Ser Ser Ala
785 790 795 800

Val Phe Leu Lys Trp Lys Ala Pro Glu Leu Lys Asp Arg His Gly Val

805 810 815

Leu Leu Asn Tyr His Val Ile Val Arg Gly Ile Asp Thr Ala His Asn

820 825 830

Phe Ser Arg Ile Leu Thr Asn Val Thr Ile Asp Ala Ala Ser Pro Thr

835 840 845

Leu Val Leu Ala Asn Leu Thr Glu Gly Val Met Tyr Thr Val Gly Val

850 855 860

Ala Ala Gly Asn Asn Ala Gly Val Gly Pro Tyr Cys Val Pro Ala Thr
865 870 875 880

Leu Arg Leu Asp Pro Ile Thr Lys Arg Leu Asp Pro Phe Ile Asn Gln

885 890 895

Arg Asp His Val Asn Asp Val Leu Thr Gln Pro Trp Phe Ile Ile Leu

900 905 910

Leu Gly Ala Ile Leu Ala Val Leu Met Leu Ser Phe Gly Ala Met Val

915 920 925

Phe Val Lys Arg Lys His Met Met Met Lys Gln Ser Ala Leu Asn Thr

930 935 940

Met Arg Gly Asn His Thr Ser Asp Val Leu Lys Met Pro Ser Leu Ser
945 950 955 960

Ala Arg Asn Gly Asn Gly Tyr Trp Leu Asp Ser Ser Thr Gly Gly Met

965 970 975

Val Trp Arg Pro Ser Pro Gly Gly Asp Ser Leu Glu Met Gln Lys Asp

980 985 990

His Ile Ala Asp Tyr Ala Pro Val Cys Gly Ala Pro Gly Ser Pro Ala

995 1000 1005

Gly Gly Gly Thr Ser Ser Gly Gly Ser Gly Gly Ala Gly Ser Gly Ala

1010 1015 1020

Ser Gly Gly Asp Asp Ile His Gly Gly His Gly Ser Glu Arg Asn Gln
1025 1030 1035 1040

Gln Arg Tyr Val Gly Glu Tyr Ser Asn Ile Pro Thr Asp Tyr Ala Glu

1045 1050 1055

Val Ser Ser Phe Gly Lys Ala Pro Ser Glu Tyr Gly Arg His Gly Asn

1060 1065 1070

Ala Ser Pro Ala Pro Tyr Ala Thr Ser Ser Ile Leu Ser Pro His Gln

1075 1080 1085

Gln Gln Gln Gln Gln Gln Pro Arg Tyr Gln Gln Arg Pro Val Pro Gly

1090 1095 1100

Tyr Gly Leu Gln Arg Pro Met His Pro His Tyr Gln Gln Gln Gln His
1105 1110 1115 1120

Gln Gln Gln Gln Ala Gln Gln Thr His Gln Gln His Gln Ala Leu Gln

1125 1130 1135

Gln His Gln Gln Leu Pro Pro Ser Asn Ile Tyr Gln Gln Met Ser Thr

1140 1145 1150

Thr Ser Glu Ile Tyr Pro Thr Asn Thr Gly Pro Ser Arg Ser Val Tyr

1155 1160 1165

Ser Glu Gln Tyr Tyr Tyr Pro Lys Asp Lys Gln Arg His Ile His Ile

1170 1175 1180

Thr Glu Asn Lys Leu Ser Asn Cys His Thr Tyr Glu Ala Ala Pro Gly
1185 1190 1195 1200

Ala Lys Gln Ser Ser Pro Ile Ser Ser Gln Phe Ala Ser Val Arg Arg

1205 1210 1215

Gln Gln Leu Pro Pro Asn Cys Ser Ile Gly Arg Glu Ser Ala Arg Phe

1220 1225 1230

Lys Val Leu Asn Thr Asp Gln Gly Lys Asn Gln Gln Asn Leu Leu Asp

1235 1240 1245

Leu Asp Gly Ser Ser Met Cys Tyr Asn Gly Leu Ala Asp Ser Gly Cys

1250 1255 1260

Gly Gly Ser Pro Ser Pro Met Ala Met Leu Met Ser His Glu Asp Glu
1265 1270 1275 1280

His Ala Leu Tyr His Thr Ala Asp Gly Asp Leu Asp Asp Met Glu Arg

1285 1290 1295

Leu Tyr Val Lys Val Asp Glu Gln Gln Pro Pro Gln Gln Gln Gln Gln

1300 1305 1310

Leu Ile Pro Leu Val Pro Gln His Pro Ala Glu Gly His Leu Gln Ser

1315 1320 1325

Trp Arg Asn Gln Ser Thr Arg Ser Ser Arg Lys Asn Gly Gln Glu Cys

1330 1335 1340

Ile Lys Glu Pro Ser Glu Leu Ile Tyr Ala Pro Gly Ser Val Ala Ser
1345 1350 1355 1360

Glu Arg Ser Leu Leu Ser Asn Ser Gly Ser Gly Thr Ser Ser Gln Pro

1365 1370 1375

Ala Gly His Asn Val

1380
1380 

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3894 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATGTACTATC TAGGTTTTTA CCACACTCAC ACACACACAC ACACATACAT AAATTTTGAT 60
AAAATTCCTA ATGCCTCAAA TCTCGCTCCC GTGATAATCG AACATCCCAT CGATGTGGTG 120
GTATCTAGGG GATCGCCAGC AACCCTCAAC TGTGGTGCAA AGCCATCTAC CGCCAAAATC 180
ACATGGTACA AGGATGGACA GCCCGTAATC ACGAATAAGG AGCAAGTGAA CAGCCACCGG 240

7\ iiurv »i 1 11 i i^ir i tftft 

ATTGTTCTCG 

ACACGGGATC

CCTGTTTCTT 

ft mft T\ tv A /"* rp /— I Tv 

CTGAAAG I GA 

7\ rpTv ft rnft ft y\ tv tv 

hi AG TGG AAA 


300

ft A ft A ft ft ft A mft 

GACAGCGATG 

ft ft ft ft A /"* /— rn 7\ 

CGGGAGCGTA 

CTATTGTGTG 

GCCAGCAACG

7V f* f*7^ fff*f*7\ f*7\ 

AGCACGGAGA

A /^rp^-i tv t\ ftrpftfi 

AG 1 GAAG 1 CG 

"3 f f\ 

A 7\ ft ft 7\ TV ^-1 TV m 

AACGAAGGAT 

/~t/-t rprp A AAA mm 

CG1TAAAAI 1 

ft ft ft ft t\ rp/~i/-irT tri i 
GGCGAlGCl 1 

ft ft ft ft 7\ A ft A r*rp 

CGCGAAGAC 1 

1 Til I \f*f1 7V fit 111 1 'ft f* 

1 1 LbA(j 1 i CG 

ft ft ft 7\ TV ft tv A A 

420

Hmmni\ ft ft ftmft 

GTTCAGGCTC 

mm ft ft mft ft a ft a 

TTGGTGGAGA 

ft tv mft ft ft f* ft rnrn 

GATGGCCG1 T 

/-! rp /-i 1^1 7\ tv rp/~l /~i 71 

C I GGAA1 GCA 

GI CCGCCACG 

rp/-i/-i tv 1 1111 \ ft ft ft ft 

lGGAi J.CCCC 

480

GAGCCGGTTG 

TGAGCTGGCG 

TV TV T\ /"I TV /*1 /~1 "TV 

GAAAGACGAC 

t\ tv t\ ft tv /^trn ft ft 

AAAGAGCTCC

ft TV tv rprp/^i TV 7\/-i7V 

GAATTCAAGA

ft a nn ft ft ft A /~»Oi 7\ 

C A I GC C AC G A 

540 

TACACTCTAC 

ACTCTGACGG 

AAACCTCATC 

ATTGATCCGG 

TCGATCGAAG 

ft f» A r-p | -p/'-""-P/'"*/-*i rp 

CGAT TC TGGjL 

600

ACTTATCAGT

GTGTTGCCAA 

CAACATGGTC 

ft ft TV TV TV ft ft ft ft 

GGAGAACGGG 

TT<Ti/^r( tv tv rp ft ft 

TGTCCAAI CC 

pfiriTi A ft A rpTiP 

CGCAAGA 1 1 G 

660

AGTGTCTTTG 

AGAAACCAAA 

GTTTGAGCAA 

ft TV TV ft ft ft TV TV ft ft 

GAACCCAAGG

7\ /-* 71 rnft 1\ ftftftrrt 

ACATGACGGT 

f* ft 7\ f^ftn-tftflft'r. 

CGACG T C GGA 

720

GCCGCAGTGC 

TGTTTGATTG

TCGTGTGACT

GGAGA ILLlt 

AAC C AC AAA i 

rp TV PPifppTV A 7V 

780

nnnTv a a 7\ a rp/~< 

CGCAAAAATG 

7\ ft fx ft ft T\rriftftft 

AGCCGA1 GCC 

7\ r~ii i 11 1 '7\ /~i 7\ f*f**~n 
AG 1 J. ACACLr 1 

CjCAIACAJ. ib 


1 L.LjvtviLjIj lib 


AGAATCGAAA

ft 7\ i u 1 1/— 1 7\ tv ft ft 

GAGTTCAACC 

A Ttft A f* A TV A 

ATCAGACGAA 

r-«/-irp/-i TV 7\ rpTv 1^*1^ 
<JjCj I (jAATACG 

TTtrrtnnft ftrnry rrtft ft 
1 1 ILiC lAlb/C 

A /"'OA A 7\rp/-ipT\ 

ALuflMi CCil 

q n n 

yuu 

ft ft ft /"i/"! 7\ A f*Tp 

GCGGGAACTC 

rprnriTi TV ft ft TV Pfl/*i 

TTGAAGCATC 

rpy-i ft tv /—i tv m/imm 

TGCACAT C 1T 

CVjT gtccagg 

ftf\ ft ft it* ft ft 7\rn ft 
CACC 1CCA1C 

Ornrp/-i^7v tv tv 

y bu 

AAACCAGCAG

ACCAGTCAGT

mft ft-7\ ft ftrrtftft tv 

TCCAGCTGGA 

y-1<*-t/-1TV ft ft ft ft T\ 7\ 

GGCACGGCAA 

Li 111 Vj AA X ij 

CACC I iGLriC 

1020

GGTCAACCGA 

/T rn ft ft ft ft ftftrn tv 

GTCCCGCCTA 

TTTTTGGAGC 

tv T\nni\ 7\ ft ft ft ft 

AAGGAAGGC C 

AACAGGATCT

TCTTTTCCCA 

1080

AGTTATGTGT 

/—i <— 1<-~1 /-irp^-1 A rnft ft 

CCGCTGATGG 

TAGAACGAAA

Gill CACCAA 

C 1 VjLriiACA. i 1 

nT\n7\7i rrtrpft 71 /-f 

uALAA 1 J. 

XX4 If 

GAAGTTCGTC 

tv tv ft mm/~i 7\ mrtT, 

AAGT TGATGA 

ft ft ft 7\ ft ftrrsrri A rn 

GGGAGCTTAT

ftr-rtft rnft ft ft ft rn ft 

GTGTGCGCTG 

ft t\ A rp/-i tv tv ftrrtft 

GAATGAACTC

ft ft ft TV ft ft TV AC~i/™1 

GGCAGGAAGC

1200 

m nun /^i tv >i j** tv 

TCGTTGAGCA 

AGGCAGCTTT 

ft "A TV ft ft TV TV ft TV 

GAAAGCAACA

mmirin t\ tv tv ft ft tv 

TTTGAAACCA

7v tv ft ft ft ft ft rp ft rn 

AAGGCCGTGT 

flft 7V A A A A A A A 

CCAAAAAAAA 

1260

AAGAGCAAAA 

rri r*t tv tv tv tv 

TGGGCAAACA 

/™t TV TV TV /"tTA 71 TV Ti TV 

GAAACAAAAA 

tv tv m/""ti i n 1 iy~*i 71 tv rn 

AATGTTCAAT 

/—I 71 TV mmT* m^tTV TV 

CAATTATCAA 

ATATTTAATT 

1320

TCAGCCGTGA 

CCGGAAACAC 

ACCCGCCAAA 

z^t /~i tv in/^Ti y^i/^i tv Tv 

CCAC CACCAA 

fi tv a m/-i/""i tv rin 

CAATCGAGCA

rnrt/^m/ ln i tv m/™t a tv 

TGGTCATCAA 

1380

AATCAGACCC 

TTATGGTTGG 

ATCATCAGCC 

ATCCTTCCAT 

GTCAGGCTAG 

ftftftTt 71 71 71 ft ft TV 

CGGAAAACCA 

1440 

ACTCCAGGAA 

TATCATGGCT 

CAGGGATGGG 

CTACCTATTG

TV TV 1 'HI 1 t TV /""I TV J - *! TV 

ACATTACAGA 

m tv /im/*i/^m tv mf\ 

TAGTCGTATC

1500 

AGTCAACATT 

CAACGGGAAG 

TCTACATATT 

GCCGATTTAA

TV /^t T\ TV TV y^l y^ltn/*i TV 

AGAAAC CTGA 

f~\ tv ^t/^/^i tv f~*\ mm 

CACCGGAGTT 

1560 

TACACTTGCA 

TTGCGAAGAA 

CGAGGATGGA 

GAGTCAACAT 

GGTCGGCATC 

TCTGACTGTT 

1620 

GAAGATCACA 

CTAGCAATGC 

ACAATTTGTT 

CGGATGCCGG 

tv mn/^Tv m^tyt tv tv 

ATCCATCGAA 

CTTCCCGTCT 

1680 

TCTCCAACGC 

tv *tv /""i it /" v i tv mm Tv m 

AAGCCATTAT

TGTCAATGTC 

7V nmnTi rp 7\ ft ft f* 

ACTGATACCG

A 7V ftrri A /"f A ft ftrr> 

AAGTAGAGCT 

pri 71 /-irp/-i ft tv A T 1 

CCAC TGGAAT 

1740 

GCTCCCTCCA 

CATCTGGCGC 

tv yi /*i tv j^t tv tv m /*t 

AGGACCAATC 

ACTGGTTATA 

rp ft tv mrn/-i tv /~irpA 

TCATTCAGTA 

PT'7\P7\P , 'T'Pm\ 

CTACAGTCCA

1800

GACCTCGGAC 

AGACGTGGTT 

m TA TV TV J 1 II 1 J /— 1 /""i TV 

TAACATTCCA

GACTACGTGG 

ft7\ rr\ftrrt A ftrnft a 

CATCTACTGA 

TV rp A rp A ft A A rp A 

ATATAGAATA 

1860 

AAGGGTCTGA 

AACCATCTCA 

CTCGTATATG 

TTTGTGATTC

/-"I TV r~1 ft TV (^1 TV TV TV TV 

GAGCAGAAAA 

rp/— t tv ^ AAA ^rim 

TGAGAAAGGT 

1920 

ATTGGAACGC 

CGAGTGTGTC 

GTCGGCTCTC 

m m tv tv /~i m tv 

GTTACCACTA

GCAAGCCAGC

tv j^i nmn tv tv /~i rn m 

AGCTCAAGTT 

1980 

GCGCTTTCTG 

ACAAGAACAA 

AATGGACATG 

GCCATCGCTG 

AGAAGAGACT 

/^l tv nri 11 n/*1A/*1 TV TV 

CACTTCGGAA 

2040

CAACTCATAA 

AACTCGAGGA 

AGTGAAGACT 

ATTAATTCTA 

CGGCCGTTCG 

TTTGTTCTGG 

2100 

AAGAAGAGGA 

AACTTGAAGA 

GCTGATTGAT 

GGTTACTACA 

TCAAGTGGAG

AGGGCCTCCA 

2160 

AGAACCAATG

ATAATCAATA 

CGTGAATGTG 

tv /^i /-t 7\ ft ft ftftrrt tv 

ACCAGCCCTA

y^l y*"t TV ^1 /*i TV TV TV TA 

GCACCGAAAA

CTATGTTGTT 

2220 

TCAAATTTAA 

TGCCATTCAC 

CAACTATGAG 

TTTTTCGTGA 

TTCCTTATCA 

TTCCGGAGTT 

2280 

CATAGTATTC 

ATGGAGCACC 

GAGTAATTCC 

ATGGACGTGT

TGACCGCCGA 

AGCTCCACCT 

2340 

TCATTGCCAC 

CAGAGGATGT 

GCGAATCCGT 

ATGCTCAACC

TGACCACTCT 

TCGTATCTCT 

2400 

TGGAAAGCAC 

CAAAAGCCGA 

CGGCATCAAC 

GGAATTCTCA 

AAGGATTCCA 

AATTGTTATT 

2460
P,TTPPTP A A p
br x x X L^r-irAbj 

pppppa ap a A 

b» bl b. \_ b- .rt»f-i.b. i-lr-i. 

P A A TPPPA A P 

A TP APT AP A A 
Ai bAb X i-i.L-i-ii-i 

APPIAPAPZAPIP 

TPPP APTPTT 
X brL-b*i-i.Lj X Lj X X 

0 RO Ci 
ZjZu 

rAb, lLlul 1 b-b^ 

rt.X J. IAVj X LJrAL-. 

TPP A ATP APP 
X bjLjrar-L X bXriL-br 

TATA A A ATTP 
X iA X iAALf-ii-i. X X 

PTPT A PPPPP 
bxX Lj IHbbbbb 

TAP A APrP A AT 

0 R R fl 

plp-'TPP arTTP 
X bjLj.rt.Lj X X bj 

PAPTPTPAPA 
LjiALj 1 v_ X b-JAb-ri 

TPP A A Pfl A P T 

X UO rii^L-bJ.rt.bj X 

P A APTPATP A 

TP4A ATP A AP A 
X bJi-ii-l X b-iAiAbjiA 

PAPPPTPPrA A 
b~£-iA_LJb- X \j\J-t\J-s. 

9 £A n 
z 0^ u 

A JiAP7\ PPTTP 

ArAA.L-.rt.bA_. lib 

L_ X bjb- X L-iAiAb.iA 

APA A A APPiA A 
rALjr-ir^r*i_rt.b, LjrAfA 

TP ATTTTTP^T 
Xb-iAX 111 iul 

A TPPP PTP. A T 

PA ATA A ATPT 
L^jHJA X jH-C-Lri. X b. X 


CAxCX 1 CC 1C 

i LjiA 1 lb i L_iA X 

TPTTP.P 7\ 7\ nun 
Ibi 1 bJL-r-irA X X 

PTP A TT A TTT 
L. X CAX XiAX X X 

TPPTAPTP AT 
X b-bj X i-ibj X b.iA X 

PATTATAP4PP 
L-iAX lAiAbbb 

z / ou 

XiA 1 ±C± XiACl 

ppappaatap 

C LjrlLjLjiArA X Au 

PAHA A AP APT 
L-.rt.LjrirAcAb.rAbj X 

PiATPPiA A A PP 

ATPPA A PTTT 
r\ X L-bJi-U-ibJ XXX 

TATA A AP, A TP 
lfil i-ii-lr-lAj r± X b- 

O VJ 

7A a tp a tp nan 

Art. 1 bjrt. 1 blLjrirl 

PTPTTPSTZIT 
ulul 1 b-H. 1H1 

ouL. X X L^L^rti-iX 

A A TPTTTPIPtP; 

i-Lc-iX \_ X X X VjvJbJ 

ATPiTTPPAPA 

A A A TPPP A AT 

2880

P A P A A T^P O A A 
LAbAA J. t_ L_ /Art 

TPTAPAAPAP 
X br XiAL-rtJ-iL^rtb^ 

TPIPTP4PA A PA 

X LjL- X LjLji"brt.Lj.rt. 

ATPiAPTATPiA 

±\ X bjiA.b- X Jr\ X \Jjri. 

APA ATAPiA A A 

TPPPPAPPPT 
X bJbJbf b.-ribJLJb. X 

2940

ptrnpu-p a ttppp 

C X C XrAl 1 bAbjL- 

ffi A p A pp A A A 
X LjrAb-rAL- L-rArtrA 

tppppa an ap 

X uUu ^rArtLjiAb- 

TTTTTP A A P A 
X X X X X L-iAJ-ib-jH. 

A TTPTP A TPl A 
i\ 1 iul Lj/A. X bA 

PTAPAPtTPtPA 

j U U w 

A PP* A ("PPP A P 7\ 

ACCA 1 uLAtA 

P Zl PPAPP ATP 
CACCAbjLjri 1 b. 

PP APPATP AP 
L-bjr-V.bjb.rAX LAL. 

T A TP ATT A TP 

PTPA APTPtAP 

TPPPHPAPPT 
X bJ bJ b. bTLjjrt.b- b. X 

f) £ fl 
J U D U 

pnrriTt ATPPPA 
CC 1 AA 1 C C CA 

TPTPTSPTTT 
lulL I rAv- XXX 

TTATP.PA A AP 
1 1A1 LJb3.rtxArt.b- 

PA ATATPAPP, 
b-rArt. 1 AX b-iAv^o 

ATPATPPATP 
i-i.XbJriXV_V-.j-i.X b* 

TPPATATnPP 
X b-b-r-i.Xr-l.XbJb.b- 

"31 OA 
-J Xii V 

APPAPAAPAP 

ACCACAACAC 

q-ipt pi rp pi prprjmp 

PA A PPA A PA A 
LjiAiAb. L_rArt.b- rAH. 

PP A PPTTPPP 
bbAub X X brbrb. 

TP A ATPAP A A 

A ATPPTTPPP 

rii-i X bJb, X X b-LjL, 

T 1 OA 
O XO u 

pipipi/TiAPPA a 
C C C C C ACC AA 

TP PPA A PA A A 
1 CCL.ririb..rArirA 

TPPPPTPPP A 

PPAPAPPPAP 

b.bJbib-bjL-.bri-i. X £\ 

TPP APATP AT 

X LjV_i-4.Lj.cA X L-rA X 

7 9 a n 

apppptppa a 

iAA_L-LjL-. XLjLj.rt-fA 

Pin PP;2\TPTPP4 
oALuH 1^1 L_Lj 

A TPP A PP PP.T 

PP A TPPP A TP 

PPAPAPPA AP 
<j bri-i.LjjH.LJ LJ2-ii-i.L^ 

TPTP,A ATPPP 

X b. X Vjr-tr-J- X LjLtL^ 

^ 7 n d 
j ^ u u 

pp. a (""TiT* a tp 

LjLj/AL- X L-b-rAX l_ 

A PPPPA PT AP 

iAb. L-bjLjr-l.L-. X 

PPP, A A PTP A A 

PP4PTPPPiATA 
L^bJbJ X b-bJbJjt-LX jHl 

PTPP A PPTP A 
LrXb b-rTtb-b, X K^r\ 

PAPAPtATPtTPt 

b^r^L-.r-i.LJri X LJ X Lj 

J J u u 

A HPTATPTTP 
-rtbJ L.1A1U1 X b> 

A PPTTP A PTP 
Au^ X X b-r-lL,. X L, 

r-1 X LUun X LJLJr-1 

jT1-v_. X UO X £Wj X n. 

f4TAAHGAA AH 


3420

pp p a p z\z\ p a p 

LubAuAALAL 

p a ppp, a at a a 

b-rt-L, b- Lj.rt-H. X rii-i. 

bJ.rt.L_, lUl OH X vJ 

Pi A PTTT A TTP 

Lxf-JA^ X X X J-i. X X O 

PHPPAPPAPP 

TTPPA ATPPA 

X X L^ L^. J-\Jr\ X v- V^i-V 

"2 ZL R n 

CCACCACC 1 C 

CACCCCACC X 

TTATPAPAPA 

PPA APT APPP 
bJb-iAi-ib- XiAbrvJb. 

PTP A PTTP A A 
LJ X b.AlbJ X X L3.HJA 

TPPTPP A APT 
X L-LJ X brLji-ir-lbj X 

T C A Pi 

A PTT 1 P 1 A PIP 1 A O 

AC 1 CCACCAC 

A APAPAPPTA 

AAuALAL CIA 

CCA 1 X Lub 1 C 

A PTP A PPP A P 

AC X bAbbbAb 

/~* T T T TP P T P P 
C X X 1 1 LjC X Cbx 

PPTTP A TPTP 
bjbr 1 1 LjiA 1 Lj X \j 

juUU 

A A TP PA 71PPP 

AA 1 C C AACC C 

PA APPAPTPP 

LAACOAb 1 CC 

PA A TPpp A AT 
bAAl LbbAAl 

TTPPP APP A A 
1 X \j C briAbj bjiAA 

PPPPPPTPA A 
CLjL-L^LjL_ X LjiAiA. 

A PPP A A A PPA 
i-ibrLjLjiAiAiAb- bjiA 

JDDU 

GACGACGATA 

GTCAGCGGTC 

TTCGTTGATG 

ATGGACGATG 

ATGGTGGATC 

TTCTGAAGCT 

3720 

GACGGGGAGA 

ACTCTGAAGG 

AGACGTTCCG 

CGTGGAGGTG 

TTAGAAAAGC 

AGTTCCTCGA 

3780 

ATGGGTATCT 

CTGCAAGTAC 

GCTGGCTCAT 

AGTTGTTACG 

GGACAAACGG 

CACTGCTCAA 

3840 

CGATTCCGGT 

CAATTCCACG 

TAACAATGGA 

ATCGTCACAC 

AAGAACAAAC 

TTGA 

3894 


(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1297 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Tyr Tyr Leu Gly Phe Tyr His Thr His Thr His Thr His Thr Tyr

1 5 10 15

Ile Asn Phe Asp Lys Ile Pro Asn Ala Ser Asn Leu Ala Pro Val Ile

20 25 30

Ile Glu His Pro Ile Asp Val Val Val Ser Arg Gly Ser Pro Ala Thr
35 40 45

Leu Asn Cys Gly Ala Lys Pro Ser Thr Ala Lys Ile Thr Trp Tyr Lys

50 55 60

Asp Gly Gln Pro Val Ile Thr Asn Lys Glu Gln Val Asn Ser His Arg
65 70 75 80

Ile Val Leu Asp Thr Gly Ser Leu Phe Leu Leu Lys Val Asn Ser Gly

85 90 95

Lys Asn Gly Lys Asp Ser Asp Ala Gly Ala Tyr Tyr Cys Val Ala Ser

100 105 110

Asn Glu His Gly Glu Val Lys Ser Asn Glu Gly Ser Leu Lys Leu Ala

115 120 125

Met Leu Arg Glu Asp Phe Arg Val Arg Pro Arg Thr Val Gln Ala Leu

130 135 140

Gly Gly Glu Met Ala Val Leu Glu Cys Ser Pro Pro Arg Gly Phe Pro
145 150 155 160

Glu Pro Val Val Ser Trp Arg Lys Asp Asp Lys Glu Leu Arg Ile Gln

165 170 175

Asp Met Pro Arg Tyr Thr Leu His Ser Asp Gly Asn Leu Ile Ile Asp

180 185 190

Pro Val Asp Arg Ser Asp Ser Gly Thr Tyr Gln Cys Val Ala Asn Asn

195 200 205

Met Val Gly Glu Arg Val Ser Asn Pro Ala Arg Leu Ser Val Phe Glu

210 215 220

Lys Pro Lys Phe Glu Gln Glu Pro Lys Asp Met Thr Val Asp Val Gly
225 230 235 240

Ala Ala Val Leu Phe Asp Cys Arg Val Thr Gly Asp Pro Gln Pro Gln

245 250 255

Ile Thr Trp Lys Arg Lys Asn Glu Pro Met Pro Val Thr Arg Ala Tyr

260 265 270

Ile Ala Lys Asp Asn Arg Gly Leu Arg Ile Glu Arg Val Gln Pro Ser

275 280 285

Asp Glu Gly Glu Tyr Val Cys Tyr Ala Arg Asn Pro Ala Gly Thr Leu

290 295 300

Glu Ala Ser Ala His Leu Arg Val Gln Ala Pro Pro Ser Phe Gln Thr
305 310 315 320

Lys Pro Ala Asp Gln Ser Val Pro Ala Gly Gly Thr Ala Thr Phe Glu

325 330 335

Cys Thr Leu Val Gly Gln Pro Ser Pro Ala Tyr Phe Trp Ser Lys Glu

340 345 350

Gly Gln Gln Asp Leu Leu Phe Pro Ser Tyr Val Ser Ala Asp Gly Arg

355 360 365

Thr Lys Val Ser Pro Thr Gly Thr Leu Thr Ile Glu Glu Val Arg Gln

370 375 380

Val Asp Glu Gly Ala Tyr Val Cys Ala Gly Met Asn Ser Ala Gly Ser
385 390 395 400

Ser Leu Ser Lys Ala Ala Leu Lys Ala Thr Phe Glu Thr Lys Gly Arg

405 410 415

Val Gln Lys Lys Lys Ser Lys Met Gly Lys Gln Lys Gln Lys Asn Val

420 425 430

Gln Ser Ile Ile Lys Tyr Leu Ile Ser Ala Val Thr Gly Asn Thr Pro

435 440 445

Ala Lys Pro Pro Pro Thr Ile Glu His Gly His Gln Asn Gln Thr Leu

450 455 460

Met Val Gly Ser Ser Ala Ile Leu Pro Cys Gln Ala Ser Gly Lys Pro
465 470 475 480

Thr Pro Gly Ile Ser Trp Leu Arg Asp Gly Leu Pro Ile Asp Ile Thr

485 490 495

Asp Ser Arg Ile Ser Gln His Ser Thr Gly Ser Leu His Ile Ala Asp

500 505 510

Leu Lys Lys Pro Asp Thr Gly Val Tyr Thr Cys Ile Ala Lys Asn Glu

515 520 525

Asp Gly Glu Ser Thr Trp Ser Ala Ser Leu Thr Val Glu Asp His Thr

530 535 540

Ser Asn Ala Gln Phe Val Arg Met Pro Asp Pro Ser Asn Phe Pro Ser
545 550 555 560

Ser Pro Thr Gln Pro Ile Ile Val Asn Val Thr Asp Thr Glu Val Glu

565 570 575

Leu His Trp Asn Ala Pro Ser Thr Ser Gly Ala Gly Pro Ile Thr Gly

580 585 590

Tyr Ile Ile Gln Tyr Tyr Ser Pro Asp Leu Gly Gln Thr Trp Phe Asn

595 600 605

Ile Pro Asp Tyr Val Ala Ser Thr Glu Tyr Arg Ile Lys Gly Leu Lys

610 615 620

Pro Ser His Ser Tyr Met Phe Val Ile Arg Ala Glu Asn Glu Lys Gly
625 630 635 640

Ile Gly Thr Pro Ser Val Ser Ser Ala Leu Val Thr Thr Ser Lys Pro

645 650 655

Ala Ala Gln Val Ala Leu Ser Asp Lys Asn Lys Met Asp Met Ala Ile

660 665 670

Ala Glu Lys Arg Leu Thr Ser Glu Gln Leu Ile Lys Leu Glu Glu Val

675 680 685

Lys Thr Ile Asn Ser Thr Ala Val Arg Leu Phe Trp Lys Lys Arg Lys

690 695 700

Leu Glu Glu Leu Ile Asp Gly Tyr Tyr Ile Lys Trp Arg Gly Pro Pro
705 710 715 720

Arg Thr Asn Asp Asn Gln Tyr Val Asn Val Thr Ser Pro Ser Thr Glu

725 730 735

Asn Tyr Val Val Ser Asn Leu Met Pro Phe Thr Asn Tyr Glu Phe Phe

740 745 750

Val Ile Pro Tyr His Ser Gly Val His Ser Ile His Gly Ala Pro Ser

755 760 765

Asn Ser Met Asp Val Leu Thr Ala Glu Ala Pro Pro Ser Leu Pro Pro

770 775 780

Glu Asp Val Arg Ile Arg Met Leu Asn Leu Thr Thr Leu Arg Ile Ser
785 790 795 800

Trp Lys Ala Pro Lys Ala Asp Gly Ile Asn Gly Ile Leu Lys Gly Phe

805 810 815

Gln Ile Val Ile Val Gly Gln Ala Pro Asn Asn Asn Arg Asn Ile Thr

820 825 830

Thr Asn Glu Arg Ala Ala Ser Val Thr Leu Phe His Leu Val Thr Gly

835 840 845

Met Thr Tyr Lys Ile Arg Val Ala Ala Arg Ser Asn Gly Gly Val Gly

850 855 860

Val Ser His Gly Thr Ser Glu Val Ile Met Asn Gln Asp Thr Leu Glu
865 870 875 880

Lys His Leu Ala Ala Gln Gln Glu Asn Glu Ser Phe Leu Tyr Gly Leu

885 890 895

Ile Asn Lys Ser His Val Pro Val Ile Val Ile Val Ala Ile Leu Ile

900 905 910

Ile Phe Val Val Ile Ile Ile Ala Tyr Cys Tyr Trp Arg Asn Ser Arg

915 920 925

Asn Ser Asp Gly Lys Asp Arg Ser Phe Ile Lys Ile Asn Asp Gly Ser

930 935 940

Val His Met Ala Ser Asn Asn Leu Trp Asp Val Ala Gln Asn Pro Asn
945 950 955 960

Gln Asn Pro Met Tyr Asn Thr Ala Gly Arg Met Thr Met Asn Asn Arg

965 970 975

Asn Gly Gln Ala Leu Tyr Ser Leu Thr Pro Asn Ala Gln Asp Phe Phe

980 985 990

Asn Asn Cys Asp Asp Tyr Ser Gly Thr Met His Arg Pro Gly Ser Glu

995 1000 1005

His His Tyr His Tyr Ala Gln Leu Thr Gly Gly Pro Gly Asn Ala Met

1010 1015 1020

Ser Thr Phe Tyr Gly Asn Gln Tyr His Asp Asp Pro Ser Pro Tyr Ala
1025 1030 1035 1040

Thr Thr Thr Leu Val Leu Ser Asn Gln Gln Pro Ala Trp Leu Asn Asp

1045 1050 1055

Lys Met Leu Arg Ala Pro Ala Met Pro Thr Asn Pro Val Pro Pro Glu

1060 1065 1070

Pro Pro Ala Arg Tyr Ala Asp His Thr Ala Gly Arg Arg Ser Arg Ser

1075 1080 1085

Ser Arg Ala Ser Asp Gly Arg Gly Thr Leu Asn Gly Gly Leu His His

1090 1095 1100

Arg Thr Ser Gly Ser Gln Arg Ser Asp Ser Pro Pro His Thr Asp Val
1105 1110 1115 1120

Ser Tyr Val Gln Leu His Ser Ser Asp Gly Thr Gly Ser Ser Lys Glu

1125 1130 1135

Arg Thr Gly Glu Arg Arg Thr Pro Pro Asn Lys Thr Leu Met Asp Phe

1140 1145 1150

Ile Pro Pro Pro Pro Ser Asn Pro Pro Pro Pro Gly Gly His Val Tyr

1155 1160 1165

Asp Thr Ala Thr Arg Arg Gln Leu Asn Arg Gly Ser Thr Pro Arg Glu

1170 1175 1180

Asp Thr Tyr Asp Ser Val Ser Asp Gly Ala Phe Ala Arg Val Asp Val
1185 1190 1195 1200

Asn Ala Arg Pro Thr Ser Arg Asn Arg Asn Leu Gly Gly Arg Pro Leu

1205 1210 1215

Lys Gly Lys Arg Asp Asp Asp Ser Gln Arg Ser Ser Leu Met Met Asp

1220 1225 1230

Asp Asp Gly Gly Ser Ser Glu Ala Asp Gly Glu Asn Ser Glu Gly Asp

1235 1240 1245

Val Pro Arg Gly Gly Val Arg Lys Ala Val Pro Arg Met Gly Ile Ser

1250 1255 1260

Ala Ser Thr Leu Ala His Ser Cys Tyr Gly Thr Asn Gly Thr Ala Gln
1265 1270 1275 1280

Arg Phe Arg Ser Ile Pro Arg Asn Asn Gly Ile Val Thr Gln Glu Gln

1285 1290 1295

Thr


(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4956 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATGAAATGGA AACATGTTCC TTTTTTGGTC ATGATATCAC TCCTCAGCTT ATCCCCAAAT 60

CACCTGTTTC TGGCCCAGCT TATTCCAGAC CCTGAAGATG TAGAGAGGGG GAACGACCAC 120

GGGACGCCAA TCCCCACCTC TGATAACGAT GACAATTCGC TGGGCTATAC AGGCTCCCGT 180

CTTCGTCAGG AAGATTTTCC ACCTCGCATT GTTGAACACC CTTCAGACCT GATTGTCTCA 240

AAAGGAGAAC CTGCAACTTT GAACTGCAAA GCTGAAGGCC GCCCCACACC CACTATTGAA 300

TGGTACAAAG GGGGAGAGAG AGTGGAGACA GACAAAGATG ACCCTCGCTC ACACCGAATG 360

TTGCTGCCGA GTGGATCTTT ATTTTTCTTA CGTATAGTAC ATGGACGGAA AAGTAGACCT 420

GATGAAGGAG TCTATGTCTG TGTAGCAAGG AATTACCTTG GAGAGGCTGT GAGCCACAAT 480

GCATCGCTGG AAGTAGCCAT ACTTCGGGAT GACTTCAGAC AAAACCCTTC GGATGTCATG 540

GTTGCAGTAG GAGAGCCTGC AGTAATGGAA TGCCAACCTC CACGAGGCCA TCCTGAGCCC 600

ACCATTTCAT GGAAGAAAGA TGGCTCTCCA CTGGATGATA AAGATGAAAG AATAACTATA 660

CGAGGAGGAA AGCTCATGAT CACTTACACC CGTAAAAGTG ACGCTGGCAA ATATGTTTGT 720

GTTGGTACCA ATATGGTTGG GGAACGTGAG AGTGAAGTAG CCGAGCTGAC TGTCTTAGAG 780

AGACCATCAT TTGTGAAGAG ACCCAGTAAC TTGGCAGTAA CTGTGGATGA CAGTGCAGAA 840

TTTAAATGTG AGGCCCGAGG TGACCCTGTA CCTACAGTAC GATGGAGGAA AGATGATGGA 900

GAGCTGCCCA AATCCAGATA TGAAATCCGA GATGATCATA CCTTGAAAAT TAGGAAGGTG 960

ACAGCTGGTG ACATGGGTTC ATACACTTGT GTTGCAGAAA ATATGGTGGG CAAAGCTGAA 1020

GCATCTGCTA CTCTGACTGT TCAAGAACCT CCACATTTTG TTGTGAAACC CCGTGACCAG 1080

GTTGTTGCTT TGGGACGGAC TGTAACTTTT CAGTGTGAAG CAACCGGAAA TCCTCAACCA 1140

GCTATTTTCT GGAGGAGAGA AGGGAGTCAG AATCTACTTT TCTCATATCA ACCACCACAG 1200

TCATCCAGCC GATTTTCAGT CTCCCAGACT GGCGACCTCA CAATTACTAA TGTCCAGCGA 1260

TCTGATGTTG GTTATTACAT CTGCCAGACT TTAAATGTTG CTGGAAGCAT CATCACAAAG 1320

GCATATTTGG AAGTTACAGA TGTGATTGCA GATCGGCCTC CCCCAGTTAT TCGACAAGGT 1380

PPTPTPA ATP
L- ... 1 Lj X v_T_H_--_ X V_ 

APAPTPTAPP 

PPTPPATPPP 

APTTTPPTPP 
rt\_ XXX X V-V- 

TP A nPTHTPrT 

X V_..TlA_JV_. 1VJJ.U1 

VJVjV_,V_._T.^f-lV_rVj v_. 

144 0 

a nT p p a p Tn p 

Hu 1 V_ V_-i--\_i X V_fV«_. 

PPA PP ATTPT 

V_, V_,_n. V^ V„i-_. X X \_ X 

PtTPtP, A Pt A A A p 

PATPPAPTPP 

\3r\ X VjOrtVJ X V— V_ 

TPP, TTTP A AP 

PPA APAPTPT 
(_» V_-.rt_HlVJrf-l.V_, X V_- X 

-L ZJ VJ VJ 

rip A A TP A A A P 
tuM 1 L AAAL 

APTTPPAPA A 
AG X X LiLALx.Hi-_. 

TPP A PT A PTP 
X Lroi-i.LT 1 X Li 

PAP A TPPPA T 
L ALA X L LLr A X 

ATPPTA A PPT 

jriX VjV_ X J rlrt.VjV_ X 

00 PTP A T A PT 
V_tV_jVjj X V_t_*-_. X _H.V_ X 

IDOU 

PPTOOPTA P7\ 

GG 1 LGG 1 ALA 

LL X GLAX XLrL- 

a TP a a r^cnr^n 

AX LAALLLLL 

APTPPTPA AP 
AG X LrG X LtAALj 

P A AP ATPP AP 

TOP 1 PTAP ATT 
X GL X X ALA X X 

loZU 

OA T\r«T lr T 1 r , 7\ AO 

GAAG X i UAAb 

AA XXX GGAG X 

TOP A PTTT 1 A P 
X LLALr X X LAG 

OOTOO A A O A O 
L L X L LAAGAC 

PTAPTP A PPP 
LXALXLrALLL 

A A ATTTA ATP 
AAAX X X AAX L 

icon 
1 b cs U 

POT AO TO OOO 

LL XAGXGLLL 

PA TP A A A A PP 

L-A X LArtAftLL 

TP A A PTP APA 
X LjAALj X LjAL A 

OA TO TO AO OA 
LrA X Li X LALrLA 

PA A ATAPAPT 
LrAAA X AL ALr X 

PAP ATT A TPP 
LALAX XAXLLi 

1 / 4U 

TOP njiA /-t/~l 7\ 7\ 

•n TTTp A ATTP 

APP APPA APT 

PPA A PATPTT 

AT ATT AT APA 
_-VXi-i.X X in. X i-i.V_T.f-V, 

APPPTTPAPP 
f-i\_rV_,V_ X X V_,_ri.V_TV_ 

lOUU 

P Zi TPP A T PTP 

PTAPPAPPTP 

\j X jHi.Vj ^rxvJv- X Vj 



TPAAAAPAPrA 

AAPATPTGPP 

1860 

ATTA AAPPAP 
J-i X X_n_r_x_,OV_j_r_.V_, 

X V_.i-JJ-j_rt.V_.V_, Irxrl 


P T TTTP PTTP 

v*l 1 X X X X \J 

TPtAPC^GPAGP 

TAATGPATAT 

J. _~i^n. _L Uv-rlX irz. -L 

192 0 

PP 21 A TT A PTP 
V_JVj._n-r_. X X flu X Vj 

_-_X v_,\__._-__rlVjV__ __.i-_. 

A AT ATP AP AT 

■i-\±\ X Jr\ X V_rt.V_I/-_ X 

PPAPTP A A A A 

PAPA APA TPT 

PPT A PPA APA 

17CU 

Au 1 LALLGLG 

TPP APP APA A 
X LjLt_H.V_* ^.r_.-_._r__r_. 


A P A P A P PTPP 

Jr\\Ji-L\J^i\J V_ X \_Iv_T 

PA A ATPPTPT 

TPTPP APPTP 
Iv-i V_rV__ri.V_, V- 1 v. 


LfitAAL L L LA 

L LLr ILL1 X XL. 

X X LL X L X X LL 

A TOO A A PTPP 
A X LGAAG X LrL 

A PTPP APAPT 
AL X LGALAvj X 

APATPA A PAP 
AGA X LAAL AG 

0 1 Pi A 

fPTPT, ota t a 
XL XLAGXAXA 

TA O A A OO A TA 

X ACAAGGA X A 

<"PA A A A rrimpmp 

X AAAA X X L X L 

TA TOOO OO AT 

XAX LGGLLAX 

PTOOAOOOA A 
L X Lr Lr AL L LAA 

PPAPPP AP A A 
L L AL LGAGAA 

x_lbU 

X L AGAL X GG i 

X AG X X X X X Lr A 

A P TP A O P A OP 

AG X GAGGALG 

OO A OOO A A A A 

L LAG L L AAAA 

APAPTPTPPT 
AL AL X Lr X GLj X 

A ATPPPTPAT 
AAX LLL XLrAX 

0 0 *? n 

OTP 7\n7\ A A OO 

L X L AGAAAGG 

OAOTPA APT A 
LALr X LAAL X A 

TP A A ATT A AP 
X LrAAA X X AALr 

PPTPPPPPTT 
LjL X LLtLLL X X 

TTTTT A A TP A 
X X X X XArt-XLjii 

ATTTPA APP A 
Hill V_.i-__H.Vjv_j_H. 

O O Q A 

OOAP ATAPTP 

AAA TP A A PTT 
AAA X L.._-_ALr X X 

TPPPA A A APP 
X L. L.i-ii-li-1-ra.L. V— 

PTPP A AP A AP 
L X LxLtAALjAALj 

t_~rt.V„l_V_i-lV_j J. VjV_ 

v»» v__r_.v__ v_ v_» \__H_H. 

0 _i n 

OOTOTA 71 firpp 

LL X G XAAL X G 

T A TPP A APA A 
X AX LLAALAA 

TP A TPP A A AP 
X LiA X LrLrAAAL 

PP A A PTPP A A 
LLrAAL X LrLAA 

T 'T'PT A P r I 'TAP 
X X L X ALi X I AL 

TTPPP A PPP A 
X XLLrLALLLA 

0 a n n 

LL1 L LAG AAG 

A OA prpPA AAA 

ALAL X LAAAA 

TPP A ATPPTP 
XGGAAXGGXL 

PAAPAPTATA 

L AAG AG X A X A 

A PPTTTPPTP 
AGLr XXX GG X G 

TPTPPPPA AT 
XLXGLGLAAX 

*D /I _C A 

_oU 

G AAAC T C GAT 

7\ /"-/"17V O A TO A A 

AC C AC A TC AA 

O A A A A OA OTO 

C AAAAC AG TG 

O A TOOTTOO A 

GA TGG Xx CCA 

CCX XIX CCGX 

OOTOATTOOO 

GG X CA X X CC C 

2520 

i'T*» MTOTTOTTO 

ITXC xXGX 11 

C xGGAAX CCG 

A TAOAOTOTO 

AXACAGXGTG 

OA AOTOOOA O 

GAAGT GG C AG 

00 A OO A OTOO 
CCAGCAC 1GG 

f-« /-t f-p O O O T O T 

GGC X GGG X C X 

2 5 80 

OOOOTA A AOA 

GGGG X AAAGA 

G X GALrLL X L-A 

/-tnppp/-t7\ TOO A O 

G X X LAX LLAb 

OTOOATOOOO 

L X GLjAX LrLLL 

A TO OA A A OOO 

AXLLAAALLL 

TOTOTO A OOT 

XG XGXLALL X 

0 _t /i n 

P APP A PPA AP 
_j._-iL_fVj_H.-_. U.i-i/-iAjr 

TP APPPTPPP 
X __._4.Vjv_. X L- _J\_- 

TO A OO A PA TT 

TP AP ATPTPP 

tpa "h.r > r , 7s.ncr < 

PP HCTTH ATA 

O *7 A A 

PP A PPT A TTP 
Vj7V_..if-iVj;V_7 ± jf-i X 1 VJ 

PA PPA PP PTP 
Vji-i. -J V_.i-V.Vj \— X vj 

TTPPATP ATP 

X X v3vji-i. X V X V— 

PTP ATPPTPT 

V — L V_i-i X VjVj Xui 

TPAPPATPTP 
X V_._-i.v_r\_.i-i X v_ X \J 

PPTTTATPPA 

VJV.1 I X_ri.X V— VT_H. 

i> 7 n 

__i / D \J 

PTiPPPPTi. a PA 
\___riV_. V^\jV___r__f\^_rl 

APA PA A A PPP, 

A PTT APT APT 

APPTAPPPPP 

i"_^.V_ Xi"i.V_.VJlV^VJlV_I 

PTATPAPA A A 

APTPPPPTPT 

jriw 1 _, L.V- VjT X \_, X 

Toon 

TTT A PffTT 1 ^ 

PA PPA AP APT 
LAL.L-AAL.ALi X 

A A PTT APPA P 
AAL X X A\_ L ALt 

AP APP APPPP 

A APPTPTPAP 

P APTPP A PPP 
V-T X V_3V_T>_.V_rV_7fo 

O Q OA 

_i 0 0 vJ 

A OO OOTO O A O 

AGGLL X bbAL 

TT'PT'PAAPAT 

X X C X L. AAL A X 

r , 7kr"T>r i a a opt 
L AL X LAAL L X 

LrL LLLLrL ALrL 

PA TPP PTPP P 
LA X LjLiL. X LLL 

AP AP APP TPP 
ALALALL X LrL 

O qa Pi 

LL.XAAXAL.Xtj 

(~*f~*T\ APA A PPA 

GL AAL AAL LA 

O A A TO A O TO O 

LAA X GAL X GL 

TOOATOAOOT 

X LLAX LAGLX 

LL X LLALGLL 

AOOOAATOOA 

AGL L AA X GGA 

"> A A A 
Jj u U U 

7\ AP7\ OOO AOA 

AALAGLGALA 

GCAAL L X L AL 

T A OOT A O A OT 

XALL XALAGX 

LGCCCAGC X G 

7\ rprpOT A T A O O 

AX XGXAXAGL 

A A ATT ATA AO 

AAAX XAXAAL 

iUoU 

AAL C AAC TGG 

ATAACAAACA 

A A OA A A TOTO 

AAL AAA X C xG 

A TOOTOOOTO 

ATGCTCCCTG 

AOTOA AOTOT 

AG X C AAC X G x 

TTATGGTGAT 

__ 12 U 

GTGGACC I TA 

OT"AA/^AAAA r n 

G I AALAAAA X 

O A A TO A O A TO 

C AA x GAGA x G 

AAAA OOTTO A 

AAAACCT X CA 

ATAOOOOA A A 

AX AGCCCAAA 

TOTO A A OO A T 

X C X GAAGGA X 

O T O A 

GGGCGTTTTG 

T C AA x C CAT C 

A OOOO A OOOT 

AGGGCAGCCT 

A OTOOTTA OO 

AC xCCTTACG 

OO A OO A OTO A 

CCACCAC XCA 

OOTOATOOAO 

GC X CATC CAG 

324 U 

X CAAALCT CA 

i^OAAPA 7\P7\Tl 

G C AAC AAC A X 

OA A O A A TO O O 

GAAC AA X GGC 

AGCGGGGAC X 

/-tTVl OOO AO A A 

C X GGC GAG AA 

OOAOTOOA A A 

GCAC X GGAAA 

*_• "5 A A 

i 3 0 0 

C CAC XGGGAC 

TvriP^PA A AOA 

AG C AG AAAC A 

AOA AOTOOOA 

AGAAGT GGC A 

OO A OTTO A OT 

CCAG X X CAG X 

A OA A OA TOOT 

AC AACA X CG X 

OOAOOAAAAO 

GGAG C AAAAC 

3 3 6 U 

a a o oto i\ a o a 
AAGC X GAACA 

A A {** A T jr P A r P/^ < /^ 

AAGAX XAXCG 

A O O A A A TO A O 

AG C AAA X GAC 

A OAOTTOOTO 

ACAGX X CC XC 

OA A OTA TOOO 

CAAC X AX CCC 

ATAOA A OOA A 

A X AC AACC AA 

3 42 0 

TCATACGACC AGAACACAGG AGGATCCTAC AACAGCTCAG ACCGGGGCAG TAGTACATCT 3480 

GGGAGTCAGG GGCACAAGAA AGGGGCAAGA ACACCCAAGG TACCAAAACA GGGTGGCATG 3540 

AACTGGGCAG ACCTGCTTCC TCCTCCCCCA GCACATCCTC CTCCACACAG CAATAGCGAA 3600 

GAGTACAACA TTTCTGTAGA TGAAAGCTAT GACCAAGAAA TGCCATGTCC CGTGCCACCA 3660 

CCAACCA X C X 

AX 1 ibLiiALA 

TipT. rpfi Tv T\ rprp TV 

AbAlbAAl XA 

bAAbAbbAbb 

A7\r , r\ T'P* A A P*P* 

AAbA X GAACG 

A Pr'PPPPTV PT 

AGGCCCCAC X 

372 0 

CCCCCXGX 1L 

GGGGAGCAGC 

i~n rrr i"p i-p rri /"i 7\ 

GC X GCCG XGX 

/^/-""PA rp7\ /-t/-i/-i7\ 

CC XAXAGCCA 

pp/^7V/^rp/^/-^7V /-IT 1 

X CAG X C CAC X 

37 80 

GCCAC1 C 1 GA 

/**< «t> r* ft rn 
CXCCCXCCCC 

APTvPr'Tt Tv P A TV 

ACAGGAAGAA 

C X CCAGCCCA 

rp/*i rprp 7\PAPP7\ 

X G X X AC AGGA 

I X G X CCAGAG 

3 84 0 

GAG ACT GGCC 

7\ /~*7\ 1 "T S P< 7\ f~* f~* 7\ 

AC A 1GCAGCA 

pi p< 7\ pi pi pi 7\ pi 

AGGAGAC GG C 

AGC C X G TGAG 

rp /-1 rtrp y-i f~\ q-t /-1 /^-i tv 
XCCXCC XCCA 

O Q A A 

CC AC C ACGGC 

CCAX C I CCCC 

i LLALA X AC C 

X AX CGC X ACA 

XXX CACGACC 

CC X CC X C X CA 

jyou 


C C LjA I CC C C C 

AuAAbAuuAA 

pTvTvpT. Or* A A C* 

Kj AAu AC bAAb 

PPP7A fTATPPS 

LbbALAIbbA 

CC X HbLLHAb 

A A 1 A 
4t U A U 

A 1 uCAAiiL OA 

CAACCC 11 J.1 

\jl X^iCvjlvjovj 

H "T T r 1 7A PP A r* A 
Li J, bAbtAbA 

r 1 7\ pp tv 1 r* r 1 t r 1 

t—A-L-C X utL X L- 

PBPTPTTPPri 

/l A O A 

GAC C 1 GGAGA 

GC lLlbi LAL 

GGGG 1 Ct_A X G 

A X CAACGGC X 

GGGGC X CAGC 

C X C AGAGGAG 

414 0 

f~* 7\ pit TV /"* A r 1 Li T n 1 1 

GACAAC Al T X 

pnii pi /'"'P' O 7V pi pi 

/-^irp/^ /~i T\ pimp n"ir"P 

L. XL. LAG XGX X 

TV PTirnPTHTPPP 

AG X X C X 1 CGG 

TV pppprpppmrp 

ACGGC X CCT X 

X X XCACXGAX 

42 00 

GCTGACTTTG CCCAGGCAGT GGCAGCAGCG GCAGAGTATG CTGGTCTGAA AGTAGCACGA 4260 

CGGCAAATGC AGGATGCTGC TGGCCGTCGA CATTTTCATG CGTCTCAGTG CCCTAGGCCC 4320 

ACAAGTCCCG TGTCTACAGA CAGCAACATG AGTGCCGCCG TAATGCAGAA AACCAGACCA 4380 

GCCAAGAAAC TGAAACACCA GCCAGGACAT CTGCGCAGAG AAACCTACAC AGATGATCTT 4440 

CCACCACCTC CTGTGCCGCC ACCTGCTATA AAGTCACCTA CTGCCCAATC CAAGACACAG 4500 

CTGGAAGTAC GACCTGTAGT GGTGCCAAAA CTCCCTTCTA TGGATGCAAG AACAGACAGA 4560 

TCATCAGACA GAAAAGGAAG CAGTTACAAG GGGAGAGAAG TGTTGGATGG AAGACAGGTT 4620 

GTTGACATGC GAACAAATCC AGGTGATCCC AGAGAAGCAC AGGAACAGCA AAATGACGGG 4680 

AAAGGACGTG GAAACAAGGC AGCAAAACGA GACCTTCCAC CAGCAAAGAC TCATCTCATC 4740 

CAAGAGGATA TTCTACCTTA TTGTAGACCT ACTTTTCCAA CATCAAATAA TCCCAGAGAT 4800 

CCCAGTTCCT CAAGCTCAAT GTCATCAAGA GGATCAGGAA GCAGACAAAG AGAACAAGCA 4860 

AATGTAGGTC GAAGAAATAT TGCAGAAATG CAGGTACTTG GAGGATATGA AAGAGGAGAA 4920 

GATAATAATG AAGAATTAGA GGAAACTGAA AGCTGA 4956 


(2) INFORMATION FOR SEQ ID NO: 8: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1651 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 

Met Lys Trp Lys His Val Pro Phe Leu Val Met Ile Ser Leu Leu Ser 

1 5 10 15 

Leu Ser Pro Asn His Leu Phe Leu Ala Gln Leu Ile Pro Asp Pro Glu 

20 25 30 

Asp Val Glu Arg Gly Asn Asp His Gly Thr Pro Ile Pro Thr Ser Asp 

35 40 45 

Asn Asp Asp Asn Ser Leu Gly Tyr Thr Gly Ser Arg Leu Arg Gln Glu 

50 55 60 

Asp Phe Pro Pro Arg Ile Val Glu His Pro Ser Asp Leu Ile Val Ser 
65 70 75 80 

Lys Gly Glu Pro Ala Thr Leu Asn Cys Lys Ala Glu Gly Arg Pro Thr 

85 90 95 

Pro Thr Ile Glu Trp Tyr Lys Gly Gly Glu Arg Val Glu Thr Asp Lys 

100 105 110 

Asp Asp Pro Arg Ser His Arg Met Leu Leu Pro Ser Gly Ser Leu Phe 

115 120 125 

Phe Leu Arg Ile Val His Gly Arg Lys Ser Arg Pro Asp Glu Gly Val 

130 135 140 

Tyr Val Cys Val Ala Arg Asn Tyr Leu Gly Glu Ala Val Ser His Asn 
145 150 155 160 

Ala Ser Leu Glu Val Ala Ile Leu Arg Asp Asp Phe Arg Gln Asn Pro 

165 170 175 

Ser Asp Val Met Val Ala Val Gly Glu Pro Ala Val Met Glu Cys Gln 

180 185 190 

Pro Pro Arg Gly His Pro Glu Pro Thr Ile Ser Trp Lys Lys Asp Gly 

195 200 205 

Ser Pro Leu Asp Asp Lys Asp Glu Arg Ile Thr Ile Arg Gly Gly Lys 

210 215 220 

Leu Met Ile Thr Tyr Thr Arg Lys Ser Asp Ala Gly Lys Tyr Val Cys 
225 230 235 240 

Val Gly Thr Asn Met Val Gly Glu Arg Glu Ser Glu Val Ala Glu Leu 

245 250 255 

Thr Val Leu Glu Arg Pro Ser Phe Val Lys Arg Pro Ser Asn Leu Ala 

260 265 270 

Val Thr Val Asp Asp Ser Ala Glu Phe Lys Cys Glu Ala Arg Gly Asp 

275 280 285 

Pro Val Pro Thr Val Arg Trp Arg Lys Asp Asp Gly Glu Leu Pro Lys 

290 295 300 

Ser Arg Tyr Glu Ile Arg Asp Asp His Thr Leu Lys Ile Arg Lys Val 
305 310 315 320 

Thr Ala Gly Asp Met Gly Ser Tyr Thr Cys Val Ala Glu Asn Met Val 

325 330 335 

Gly Lys Ala Glu Ala Ser Ala Thr Leu Thr Val Gln Glu Pro Pro His 

340 345 350 

Phe Val Val Lys Pro Arg Asp Gln Val Val Ala Leu Gly Arg Thr Val 

355 360 365 

Thr Phe Gln Cys Glu Ala Thr Gly Asn Pro Gln Pro Ala Ile Phe Trp 

370 375 380 

Arg Arg Glu Gly Ser Gln Asn Leu Leu Phe Ser Tyr Gln Pro Pro Gln 
385 390 395 400 

Ser Ser Ser Arg Phe Ser Val Ser Gln Thr Gly Asp Leu Thr Ile Thr 

405 410 415 

Asn Val Gln Arg Ser Asp Val Gly Tyr Tyr Ile Cys Gln Thr Leu Asn 

420 425 430 

Val Ala Gly Ser Ile Ile Thr Lys Ala Tyr Leu Glu Val Thr Asp Val 

435 440 445 

Ile Ala Asp Arg Pro Pro Pro Val Ile Arg Gln Gly Pro Val Asn Gln 

450 455 460 

Thr Val Ala Val Asp Gly Thr Phe Val Leu Ser Cys Val Ala Thr Gly 
465 470 475 480 

Ser Pro Val Pro Thr Ile Leu Trp Arg Lys Asp Gly Val Leu Val Ser 

485 490 495 

Thr Gln Asp Ser Arg Ile Lys Gln Leu Glu Asn Gly Val Leu Gln Ile 

500 505 510 

Arg Tyr Ala Lys Leu Gly Asp Thr Gly Arg Tyr Thr Cys Ile Ala Ser 

515 520 525 

Thr Pro Ser Gly Glu Ala Thr Trp Ser Ala Tyr Ile Glu Val Gln Glu 

530 535 540 

Phe Gly Val Pro Val Gln Pro Pro Arg Pro Thr Asp Pro Asn Leu Ile 
545 550 555 560 

Pro Ser Ala Pro Ser Lys Pro Glu Val Thr Asp Val Ser Arg Asn Thr 

565 570 575 

Val Thr Leu Ser Trp Gln Pro Asn Leu Asn Ser Gly Ala Thr Pro Thr 

580 585 590 

Ser Tyr Ile Ile Glu Ala Phe Ser His Ala Ser Gly Ser Ser Trp Gln 

595 600 605 

Thr Val Ala Glu Asn Val Lys Thr Glu Thr Ser Ala Ile Lys Gly Leu 

610 615 620 

Lys Pro Asn Ala Ile Tyr Leu Phe Leu Val Arg Ala Ala Asn Ala Tyr 
625 630 635 640 

Gly Ile Ser Asp Pro Ser Gln Ile Ser Asp Pro Val Lys Thr Gln Asp 

645 650 655 

Val Leu Pro Thr Ser Gln Gly Val Asp His Lys Gln Val Gln Arg Glu 

660 665 670 

Leu Gly Asn Ala Val Leu His Leu His Asn Pro Thr Val Leu Ser Ser 

675 680 685 

Ser Ser Ile Glu Val His Trp Thr Val Asp Gln Gln Ser Gln Tyr Ile 

690 695 700 

Gln Gly Tyr Lys Ile Leu Tyr Arg Pro Ser Gly Ala Asn His Gly Glu 
705 710 715 720 

Ser Asp Trp Leu Val Phe Glu Val Arg Thr Pro Ala Lys Asn Ser Val 

725 730 735 

Val Ile Pro Asp Leu Arg Lys Gly Val Asn Tyr Glu Ile Lys Ala Arg 

740 745 750 

Pro Phe Phe Asn Glu Phe Gln Gly Ala Asp Ser Glu Ile Lys Phe Ala 

755 760 765 

Lys Thr Leu Glu Glu Ala Pro Ser Ala Pro Pro Gln Gly Val Thr Val 

770 775 780 

Ser Lys Asn Asp Gly Asn Gly Thr Ala Ile Leu Val Ser Trp Gln Pro 
785 790 795 800 

Pro Pro Glu Asp Thr Gln Asn Gly Met Val Gln Glu Tyr Lys Val Trp 

805 810 815 

Cys Leu Gly Asn Glu Thr Arg Tyr His Ile Asn Lys Thr Val Asp Gly 

820 825 830 

Ser Thr Phe Ser Val Val Ile Pro Phe Leu Val Pro Gly Ile Arg Tyr 

835 840 845 

Ser Val Glu Val Ala Ala Ser Thr Gly Ala Gly Ser Gly Val Lys Ser 

850 855 860 

Glu Pro Gln Phe Ile Gln Leu Asp Ala His Gly Asn Pro Val Ser Pro 
865 870 875 880 

Glu Asp Gln Val Ser Leu Ala Gln Gln Ile Ser Asp Val Val Lys Gln 

885 890 895 

Pro Ala Phe Ile Ala Gly Ile Gly Ala Ala Cys Trp Ile Ile Leu Met 

900 905 910 

Val Phe Ser Ile Trp Leu Tyr Arg His Arg Lys Lys Arg Asn Gly Leu 

915 920 925 

Thr Ser Thr Tyr Ala Gly Ile Arg Lys Val Pro Ser Phe Thr Phe Thr 

930 935 940 

Pro Thr Val Thr Tyr Gln Arg Gly Gly Glu Ala Val Ser Ser Gly Gly 
945 950 955 960 

Arg Pro Gly Leu Leu Asn Ile Ser Glu Pro Ala Ala Gln Pro Trp Leu 

965 970 975 

Ala Asp Thr Trp Pro Asn Thr Gly Asn Asn His Asn Asp Cys Ser Ile 

980 985 990 

Ser Cys Cys Thr Ala Gly Asn Gly Asn Ser Asp Ser Asn Leu Thr Thr 

995 1000 1005 

Tyr Ser Arg Pro Ala Asp Cys Ile Ala Asn Tyr Asn Asn Gln Leu Asp 

1010 1015 1020 

Asn Lys Gln Thr Asn Leu Met Leu Pro Glu Ser Thr Val Tyr Gly Asp 
1025 1030 1035 1040 

Val Asp Leu Ser Asn Lys Ile Asn Glu Met Lys Thr Phe Asn Ser Pro 

1045 1050 1055 

Asn Leu Lys Asp Gly Arg Phe Val Asn Pro Ser Gly Gln Pro Thr Pro 

1060 1065 1070 

Tyr Ala Thr Thr Gln Leu Ile Gln Ser Asn Leu Ser Asn Asn Met Asn 

1075 1080 1085 

Asn Gly Ser Gly Asp Ser Gly Glu Lys His Trp Lys Pro Leu Gly Gln 

1090 1095 1100 

Gln Lys Gln Glu Val Ala Pro Val Gln Tyr Asn Ile Val Glu Gln Asn 
1105 1110 1115 1120 

Lys Leu Asn Lys Asp Tyr Arg Ala Asn Asp Thr Val Pro Pro Thr Ile 

1125 1130 1135 

Pro Tyr Asn Gln Ser Tyr Asp Gln Asn Thr Gly Gly Ser Tyr Asn Ser 

1140 1145 1150 

Ser Asp Arg Gly Ser Ser Thr Ser Gly Ser Gln Gly His Lys Lys Gly 

1155 1160 1165 

Ala Arg Thr Pro Lys Val Pro Lys Gln Gly Gly Met Asn Trp Ala Asp 

1170 1175 1180 

Leu Leu Pro Pro Pro Pro Ala His Pro Pro Pro His Ser Asn Ser Glu 
1185 1190 1195 1200 

Glu Tyr Asn Ile Ser Val Asp Glu Ser Tyr Asp Gln Glu Met Pro Cys 

1205 1210 1215 

Pro Val Pro Pro Ala Arg Met Tyr Leu Gln Gln Asp Glu Leu Glu Glu 

1220 1225 1230 

Glu Glu Asp Glu Arg Gly Pro Thr Pro Pro Val Arg Gly Ala Ala Ser 

1235 1240 1245 

Ser Pro Ala Ala Val Ser Tyr Ser His Gln Ser Thr Ala Thr Leu Thr 

1250 1255 1260 

Pro Ser Pro Gln Glu Glu Leu Gln Pro Met Leu Gln Asp Cys Pro Glu 
1265 1270 1275 1280 

Glu Thr Gly His Met Gln His Gln Pro Asp Arg Arg Arg Gln Pro Val 

1285 1290 1295 

Ser Pro Pro Pro Pro Pro Arg Pro Ile Ser Pro Pro His Thr Tyr Gly 

1300 1305 1310 

Tyr Ile Ser Gly Pro Leu Val Ser Asp Met Asp Thr Asp Ala Pro Glu 

1315 1320 1325 

Glu Glu Glu Asp Glu Ala Asp Met Glu Val Ala Lys Met Gln Thr Arg 

1330 1335 1340 

Arg Leu Leu Leu Arg Gly Leu Glu Gln Thr Pro Ala Ser Ser Val Gly 
1345 1350 1355 1360 

Asp Leu Glu Ser Ser Val Thr Gly Ser Met Ile Asn Gly Trp Gly Ser 

1365 1370 1375 

Ala Ser Glu Glu Asp Asn Ile Ser Ser Gly Arg Ser Ser Val Ser Ser 

1380 1385 1390 

Ser Asp Gly Ser Phe Phe Thr Asp Ala Asp Phe Ala Gln Ala Val Ala 

1395 1400 1405 

Ala Ala Ala Glu Tyr Ala Gly Leu Lys Val Ala Arg Arg Gln Met Gln 

1410 1415 1420 

Asp Ala Ala Gly Arg Arg His Phe His Ala Ser Gln Cys Pro Arg Pro 
1425 1430 1435 1440 

Thr Ser Pro Val Ser Thr Asp Ser Asn Met Ser Ala Ala Val Met Gln 

1445 1450 1455 

Lys Thr Arg Pro Ala Lys Lys Leu Lys His Gln Pro Gly His Leu Arg 

1460 1465 1470 

Arg Glu Thr Tyr Thr Asp Asp Leu Pro Pro Pro Pro Val Pro Pro Pro 

1475 1480 1485 

Ala Ile Lys Ser Pro Thr Ala Gln Ser Lys Thr Gln Leu Glu Val Arg 

1490 1495 1500 

Pro Val Val Val Pro Lys Leu Pro Ser Met Asp Ala Arg Thr Asp Arg 
1505 1510 1515 1520 

Ser Ser Asp Arg Lys Gly Ser Ser Tyr Lys Gly Arg Glu Val Leu Asp 

1525 1530 1535 

Gly Arg Gln Val Val Asp Met Arg Thr Asn Pro Gly Asp Pro Arg Glu 

1540 1545 1550 

Ala Gln Glu Gln Gln Asn Asp Gly Lys Gly Arg Gly Asn Lys Ala Ala 

1555 1560 1565 

Lys Arg Asp Leu Pro Pro Ala Lys Thr His Leu Ile Gln Glu Asp Ile 

1570 1575 1580 

Leu Pro Tyr Cys Arg Pro Thr Phe Pro Thr Ser Asn Asn Pro Arg Asp 
1585 1590 1595 1600 

Pro Ser Ser Ser Ser Ser Met Ser Ser Arg Gly Ser Gly Ser Arg Gln 

1605 1610 1615 

Arg Glu Gln Ala Asn Val Gly Arg Arg Asn Ile Ala Glu Met Gln Val 

1620 1625 1630 

Leu Gly Gly Tyr Glu Arg Gly Glu Asp Asn Asn Glu Glu Leu Glu Glu 

1635 1640 1645 

Thr Glu Ser 
1650 


(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 
(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 855..1187 

(D) OTHER INFORMATION: /note= "N signifies gap in sequence" 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 


CAGATTGTTG CTCAAGGTCG AACAGTGACA TTTCCCTGTG AAACTAAAGG AAACCCACAG 60 

CCAGCTGTTT TTTGGCAGAA AGAAGGCAGC CAGAACCTAC TTTTCCCAAA CCAACCCCAG 120 

CAGCCCAACA GTAGATGCTC AGTGTCACCA ACTGGAGACC TCACAATCAC CAACATTCAA 180 

CGTTCCGACG CGGGTTACTA CATCTGCCAG GCTTTAACTG TGGCAGGAAG CATTTTAGCA 240 

AAAGCTCAAC TGGAGGTTAC TGATGTTTTG ACAGATAGAC CTCCACCTAT AATTCTACAA 300 

GGCCCAGCCA ACCAAACGCT GGCAGTGGAT GGTACAGCGT TACTGAAATG TAAAGCCACT 360 

GGTGATCCTC TTCCTGTAAT TAGCTGGTTA AAGGAGGGAT TTACTTTTCC GGGTAGAGAT 420 

CCAAGAGCAA CAATTCAAGA GCAAGGCACA CTGCAGATTA AGAATTTACG GATTTCTGAT 480 

ACTGGCACTT ATACTTGTGT GGCTACAAGT TCAAGTGGAG AGGCTTCCTG GAGTGCAGTG 540 

CTGGATGTGA CAGAGTCTGG AGCAACAATC AGTAAAAACT ATGATTTAAG TGACCTGCCA 600 

GGGCCACCAT CCAAACCGCA AGTCACTGAT GTTACTAAGA ACAGTGTCAC CTTGTCCTGG 660 

CAGCCAGGTA CCCCTGGAAC CCTTCCAGCA AGTGCATATA TCATTGAGGC TTTCAGCCAA 720 

TCAGTGAGCA ACAGCTGGCA GACCGTGGCA AACCATGTAA AGACCACCCT CTATACTGTA 780 

AGAGGACTGC GGCCCAATAC AATCTACTTA TTCATGGTCA GAGCGATCAA CCCCAAGGTY 840 

TCAGTGACCC AAGTNAAACC ACAGAAAAAC AATGGATCCA CTTGGGCCAA TGTCCCTCTA 900 

CCTCCCCCCC CAGTCCAGCC CCTTCCTGGC ACGGAGCTGG AACACTATGC AGTGGAACAA 960 

CAAGAAAATG GCTATGACAG TGATAGCTGG TGCCCACCAT TGCCAGTACA AACTTACTTA 1020 

CACCAAGGTC TGGAAGATGA ACTGGAAGAA GATGATGATA GGGTCCCAAC ACCTCCTGTT 1080 

CGAGGCGTGG CTTCTTCTCC TGCTATCTCC TTTGGACAGC AGTCCACTGC AACTCTTACT 1140 

CCATCCCCAC GGGAAGAGAT GCAACCCATG CTGCAGGCTT CACCTNTTTA CCTCCTCTCA 1200 

AAGACCTCGA CCTACCAGCC CATTTTCTAC TGACAGTAAC ACCAGTGCAG CCCTGAGTCA 1260 

AAGTCAGAGG CCTCGGCCCA CTAAAAAACA CAAGGGAGGG 1300 


(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 434 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 
(ix) FEATURE: 

(A) NAME/KEY: Modified-site 

(B) LOCATION: 285..396 

(D) OTHER INFORMATION: /note= "Xaa signifies gap in sequence" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Gln Ile Val Ala Gln Gly Arg Thr Val Thr Phe Pro Cys Glu Thr Lys 

1 5 10 15 

Gly Asn Pro Gln Pro Ala Val Phe Trp Gln Lys Glu Gly Ser Gln Asn 

20 25 30 

Leu Leu Phe Pro Asn Gln Pro Gln Gln Pro Asn Ser Arg Cys Ser Val 

35 40 45 

Ser Pro Thr Gly Asp Leu Thr Ile Thr Asn Ile Gln Arg Ser Asp Ala 

50 55 60 

Gly Tyr Tyr Ile Cys Gln Ala Leu Thr Val Ala Gly Ser Ile Leu Ala 
65 70 75 80 

Lys Ala Gln Leu Glu Val Thr Asp Val Leu Thr Asp Arg Pro Pro Pro 

85 90 95 

Ile Ile Leu Gln Gly Pro Ala Asn Gln Thr Leu Ala Val Asp Gly Thr 

100 105 110 

Ala Leu Leu Lys Cys Lys Ala Thr Gly Asp Pro Leu Pro Val Ile Ser 

115 120 125 

Trp Leu Lys Glu Gly Phe Thr Phe Pro Gly Arg Asp Pro Arg Ala Thr 

130 135 140 

Ile Gln Glu Gln Gly Thr Leu Gln Ile Lys Asn Leu Arg Ile Ser Asp 
145 150 155 160 

Thr Gly Thr Tyr Thr Cys Val Ala Thr Ser Ser Ser Gly Glu Ala Ser 

165 170 175 

Trp Ser Ala Val Leu Asp Val Thr Glu Ser Gly Ala Thr Ile Ser Lys 

180 185 190 

Asn Tyr Asp Leu Ser Asp Leu Pro Gly Pro Pro Ser Lys Pro Gln Val 

195 200 205 

Thr Asp Val Thr Lys Asn Ser Val Thr Leu Ser Trp Gln Pro Gly Thr 

210 215 220 

Pro Gly Thr Leu Pro Ala Ser Ala Tyr Ile Ile Glu Ala Phe Ser Gln 
225 230 235 240 

Ser Val Ser Asn Ser Trp Gln Thr Val Ala Asn His Val Lys Thr Thr 

245 250 255 

Leu Tyr Thr Val Arg Gly Leu Arg Pro Asn Thr Ile Tyr Leu Phe Met 

260 265 270 

Val Arg Ala Ile Asn Pro Lys Val Ser Val Thr Gln Xaa Lys Pro Gln 

275 280 285 

Lys Asn Asn Gly Ser Thr Trp Ala Asn Val Pro Leu Pro Pro Pro Pro 

290 295 300 

Val Gln Pro Leu Pro Gly Thr Glu Leu Glu His Tyr Ala Val Glu Gln 
305 310 315 320 

Gln Glu Asn Gly Tyr Asp Ser Asp Ser Trp Cys Pro Pro Leu Pro Val 

325 330 335 

Gln Thr Tyr Leu His Gln Gly Leu Glu Asp Glu Leu Glu Glu Asp Asp 

340 345 350 

Asp Arg Val Pro Thr Pro Pro Val Arg Gly Val Ala Ser Ser Pro Ala 

355 360 365 

Ile Ser Phe Gly Gln Gln Ser Thr Ala Thr Leu Thr Pro Ser Pro Arg 

370 375 380 

Glu Glu Met Gln Pro Met Leu Gln Ala Ser Pro Xaa Phe Thr Ser Ser 
385 390 395 400 

Gln Arg Pro Arg Pro Thr Ser Pro Phe Ser Thr Asp Ser Asn Thr Ser 

405 410 415 

Ala Ala Leu Ser Gln Ser Gln Arg Pro Arg Pro Thr Lys Lys His Lys 

420 425 430 

Gly Gly 


(2) INFORMATION FOR SEQ ID NO: 11: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 444 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GCCCAGGCAG TTGCTGCAGC TGCGGAGTAT GCGGGCCTGA AAGTGGCTCG CCGCCAAATG 60 

CAAGATGCTG CTGGCCGCCG CCACTTCCAT GCCTCTCAGT GCCCAAGGCC CACGAGTCCT 120 

GTGTCCACAG ACAGCAACAT GAGTGCTGTT GTGATCCAGA AAGCCAGACC CGCCAAGAAG 180 

CAGAAACACC AGCCAGGACA TCTGCGCAGG GAAGCCTACG CAGATGATCT TCCACCCCCT 240 

CCAGTGCCAC CACCTGCTAT AAAATCGCCC ACTGTCCAGT CCAAGGCACA GCTGGAGGTA 300 

CGGCCTGTCA TGGTGCCAAA ACTCGCGTCT ATAGAAGCAA GGACAGATAG ATCGTCAGAC 360 

AGAAAAGGAG GCAGTTACAA GGGGAGAGAA GCTCTGGATG GAAGACAAGT CACTGACCTG 420 

CGAACAAATC CAAGTGACCC CAGA 444 


(2) INFORMATION FOR SEQ ID NO: 12: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 148 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Ala Gln Ala Val Ala Ala Ala Ala Glu Tyr Ala Gly Leu Lys Val Ala 

1 5 10 15 

Arg Arg Gln Met Gln Asp Ala Ala Gly Arg Arg His Phe His Ala Ser 

20 25 30 

Gln Cys Pro Arg Pro Thr Ser Pro Val Ser Thr Asp Ser Asn Met Ser 

35 40 45 

Ala Val Val Ile Gln Lys Ala Arg Pro Ala Lys Lys Gln Lys His Gln 

50 55 60 

Pro Gly His Leu Arg Arg Glu Ala Tyr Ala Asp Asp Leu Pro Pro Pro 
65 70 75 80 

Pro Val Pro Pro Pro Ala Ile Lys Ser Pro Thr Val Gln Ser Lys Ala 

85 90 95 

Gln Leu Glu Val Arg Pro Val Met Val Pro Lys Leu Ala Ser Ile Glu 

100 105 110 

Ala Arg Thr Asp Arg Ser Ser Asp Arg Lys Gly Gly Ser Tyr Lys Gly 

115 120 125 

Arg Glu Ala Leu Asp Gly Arg Gln Val Thr Asp Leu Arg Thr Asn Pro 

130 135 140 

Ser Asp Pro Arg 
145 